Routing is a lot simpler than people think. When a box gets a packet, it checks the destination against all of its interfaces to see if the packet is for it (some systems are "strongly bound", which is a little like antispoofing; we can ignore that for this discussion). If the box does not have an interface with that IP (and if it has IP forwarding enabled), it sends the packet to the FIB for a decision on where to send it next.
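As a rough sketch of that first check (the function name and the three string results are made up for illustration, and the "drop" branch assumes a box without IP forwarding simply discards packets that aren't addressed to it):

```python
import ipaddress

def classify_packet(dest, local_addresses, forwarding_enabled):
    """Decide what a box does with a packet it just received.

    Returns "deliver" (the packet is for this box), "forward"
    (hand it to the FIB lookup), or "drop".
    """
    dest_ip = ipaddress.ip_address(dest)
    local = {ipaddress.ip_address(a) for a in local_addresses}
    if dest_ip in local:
        return "deliver"
    if forwarding_enabled:
        return "forward"
    # Assumption: with forwarding disabled, traffic for other hosts is discarded.
    return "drop"
```

Note that this models the simple case only; it ignores the "strongly bound" behavior mentioned above, where the receiving interface also matters.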
Conceptually, a FIB is a data structure called a binary trie (pronounced "tree"; the weird spelling comes from the middle of "retrieval"). Real implementations are usually compressed or optimized variants, but the logic is the same. You start at the root node, which is 0.0.0.0/0. Each node has a link to the next two nodes. As an example of the links, 0.0.0.0/0 has a link to 0.0.0.0/1 and 128.0.0.0/1. 128.0.0.0/1 then has a link to 128.0.0.0/2 and 192.0.0.0/2. And so on. Each node also has an optional interface to send the traffic out (always defined if there is a route for that address/mask combination), and an optional gateway IP address (defined if the route has a gateway). Once you have crawled the trie to find the longest mask match for your destination IP, you use that node's interface to send the traffic. That's really all there is to the routing decision.
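Here is a minimal sketch of that trie in Python, IPv4 only. The class and method names (Node, Fib, add_route, lookup) are invented for illustration, and a production FIB would compress the trie rather than allocate one node per bit, but the crawl is the same idea:

```python
import ipaddress

class Node:
    def __init__(self):
        self.children = [None, None]  # link for next bit 0, link for next bit 1
        self.interface = None         # set only if a route ends at this node
        self.gateway = None           # set only if that route has a gateway

class Fib:
    def __init__(self):
        self.root = Node()            # the root represents 0.0.0.0/0

    def add_route(self, prefix, interface, gateway=None):
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        node = self.root
        # Walk one level per mask bit, creating nodes as needed.
        for depth in range(net.prefixlen):
            bit = (bits >> (31 - depth)) & 1
            if node.children[bit] is None:
                node.children[bit] = Node()
            node = node.children[bit]
        node.interface = interface
        node.gateway = gateway

    def lookup(self, dest):
        """Return (interface, gateway) for the longest match, or None."""
        bits = int(ipaddress.ip_address(dest))
        node = self.root
        best = None
        depth = 0
        while node is not None:
            if node.interface is not None:
                best = (node.interface, node.gateway)  # deeper match wins
            if depth == 32:
                break
            bit = (bits >> (31 - depth)) & 1
            node = node.children[bit]
            depth += 1
        return best

fib = Fib()
fib.add_route("0.0.0.0/0", "eth1", "10.20.30.1")
fib.add_route("10.20.30.0/24", "eth1")
print(fib.lookup("6.7.8.9"))     # ('eth1', '10.20.30.1')
print(fib.lookup("10.20.30.1"))  # ('eth1', None)
```

The lookup remembers the last node it passed that had an interface defined, which is exactly the "longest mask match" described above.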
If there is no gateway IP, you ARP for the destination IP. If there is a gateway IP, you ARP for the gateway IP. That's all the gateway address actually affects.
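In code, that rule is about one line. The function names and the dictionary fields below are hypothetical, just a way to show that the gateway only changes who you ARP for and never the destination IP in the packet:

```python
def resolve_l2_target(dest_ip, gateway_ip=None):
    """Return the IP address to ARP for when sending toward dest_ip."""
    return dest_ip if gateway_ip is None else gateway_ip

def build_frame(dest_ip, interface, gateway_ip=None):
    """Describe the outgoing frame for a routing decision."""
    return {
        "out_interface": interface,
        "ip_dst": dest_ip,  # routing never rewrites the destination IP
        "arp_for": resolve_l2_target(dest_ip, gateway_ip),
    }

print(build_frame("6.7.8.9", "eth1", "10.20.30.1"))
print(build_frame("2.3.4.5", "bond1"))
```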
This gets a little more interesting on systems which support multiple FIBs (think VRFs, Linux network namespaces, OpenBSD rdomains). Ultimately, you just have multiple tries, and a little metadata connected to the packet tells the OS which trie to crawl for that packet.
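A sketch of the multiple-FIB idea follows. To keep it short, each table is a plain list scanned linearly for the longest match instead of a real trie, and the table names, interfaces, and addresses are all made up (the addresses come from documentation ranges):

```python
import ipaddress

def longest_match(routes, dest):
    """Linear-scan stand-in for a trie crawl: return (interface, gateway)."""
    dest_ip = ipaddress.ip_address(dest)
    best = None
    for prefix, interface, gateway in routes:
        net = ipaddress.ip_network(prefix)
        if dest_ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, interface, gateway)
    return None if best is None else (best[1], best[2])

# One independent table per VRF / namespace / rdomain (hypothetical contents).
fibs = {
    "vrf-a": [("0.0.0.0/0", "eth0", "192.0.2.1")],
    "vrf-b": [("0.0.0.0/0", "eth5", "198.51.100.1")],
}

def route(packet):
    # The "table" field is the metadata that tells the OS which FIB to use.
    return longest_match(fibs[packet["table"]], packet["dst"])

print(route({"table": "vrf-a", "dst": "8.8.8.8"}))  # ('eth0', '192.0.2.1')
print(route({"table": "vrf-b", "dst": "8.8.8.8"}))  # ('eth5', '198.51.100.1')
```

Same destination, different answer, purely because of the metadata attached to the packet.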
Some operating systems try to make things "easier" by hiding some of this from you. This almost invariably results in making things much more confusing (much like VSX itself is more confusing than it would have been if they had just exposed the VRF, or now the network namespace, functionality directly). You may be able to specify your default route in terms of an interface and a gateway directly, or you may need to specify a smaller route for the gateway first so the "helpful" OS knows which interface to use. I have never seen an operating system where this would not work, though.
Let's say the VPN endpoint's public interface (eth1) is 2.3.4.5, your Internet firewall is VS 5, VS 5's public interface (bond2) is 2.3.4.2, and VS 5's interface towards the VPN (bond1) is 10.20.30.1. Add a local.arp entry on VS 5 telling it to claim to own 2.3.4.5 on bond2. Add a route on VS 5 telling it 2.3.4.5 is out bond1 with no gateway. Add a route on the VPN endpoint telling it 0.0.0.0/0 is behind the gateway 10.20.30.1. You might also need to add a route telling the VPN endpoint that 10.20.30.1 is out eth1, but you might not.
When the client 6.7.8.9 tries to connect, the Internet router will ask where 2.3.4.5 is, the firewall will respond with its MAC, and the packet will go to the firewall. The firewall doesn't have an interface with that IP, so it will consult its FIB and get the decision to send it out bond1 with no gateway (thereby telling it to ARP for the destination, which is still 2.3.4.5). The traffic will get to the VPN box, and the VPN box will send a response. Assuming it doesn't cache the request details to prepopulate the response, it will consult its routing table and won't find a longer match than the default route, which will tell it to go out eth1 with the gateway address 10.20.30.1. It ARPs for 10.20.30.1, the firewall responds, the packet goes to the firewall, and the firewall sends it along.
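The walk-through above can be modeled with two small route lists. This is only a sketch: it uses a linear longest-match scan rather than a trie, the helper name next_hop is invented, and the firewall's default route via 2.3.4.1 is a hypothetical Internet router address that is not part of the setup described here.

```python
import ipaddress

def next_hop(routes, dest):
    """Return (out_interface, ip_to_arp_for) using longest-prefix match."""
    dest_ip = ipaddress.ip_address(dest)
    best = None
    for prefix, interface, gateway in routes:
        net = ipaddress.ip_network(prefix)
        if dest_ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, interface, gateway)
    if best is None:
        return None
    _, interface, gateway = best
    # No gateway: ARP for the destination. Gateway: ARP for the gateway.
    return (interface, dest if gateway is None else gateway)

firewall_routes = [
    ("2.3.4.5/32", "bond1", None),      # VPN endpoint is out bond1, no gateway
    ("0.0.0.0/0", "bond2", "2.3.4.1"),  # hypothetical Internet router
]
vpn_routes = [
    ("10.20.30.1/32", "eth1", None),       # the "might be needed" helper route
    ("0.0.0.0/0", "eth1", "10.20.30.1"),   # default route via the firewall
]

# Inbound: client 6.7.8.9 -> 2.3.4.5, arriving at the firewall.
print(next_hop(firewall_routes, "2.3.4.5"))  # ('bond1', '2.3.4.5')
# Return: VPN endpoint -> 6.7.8.9.
print(next_hop(vpn_routes, "6.7.8.9"))       # ('eth1', '10.20.30.1')
```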
Years ago, due to some insane requirements, I had to build a firewall which could be inserted between servers and their default gateways (so bridge mode), but which could also NAT the traffic to maintain symmetry through load balancers (so not bridge mode). That firewall works a lot like what I described above.