I am trying to fix my OpenVPN site-to-site link, and because the environment was updated I had to make some changes. The initial OpenVPN setup is described here: https://blog.voina.fr/edgerouter-dual-wan-hair-pin-multiple-networks-openvpn-site-to-site-vpn/
First of all, there is a new EdgeRouter ER-8 that is directly linked to the main ISP (I got it from Amazon.de, see Ubiquiti ER-8 Netzwerk/Router). In the future the plan is to also link it to another cable land-line ISP (the 3rd), so that it will load-balance between two ISPs.
The current setup looks like:
Primary Site:
ER-8 (with load-balancing WAN1 and WAN 2):
– WAN 1: eth0 linked to the ISP 1 through a Hitron cable modem in bridge mode. Thus the ER-8 gets the IP from the ISP.
– WAN 2: eth1 not linked.
– LAN 7: eth7 to internal LAN 2
– LAN 11: eth2 internal LAN 11
D-Link DWR-921 LTE:
– WAN 1: LTE link to Mobile service ISP.
EdgeRouter POE:
– WAN 1: eth0, IP = 192.168.7.10, linked to EdgeRouter ER-8 eth7 with gateway 192.168.7.1
– WAN 2: eth1, IP = 192.168.0.50, linked to D-Link DWR-921 LTE eth4 with gateway 192.168.0.1
– LAN 2: switch0, all the internal LAN
Remote Site:
UPC Cable Modem:
– WAN 1: eth0 linked to the ISP 1
EdgeRouter Lite:
– WAN 1: eth0, link to UPC Cable Modem eth1 with gateway 192.168.0.1
– LAN 9: eth1, local service LAN
– LAN 10: eth2, local management LAN
Now I can safely apply what ubnt-stig suggested on http://community.ubnt.com/t5/EdgeMAX/Dual-WAN-failover-OpenVPN-site-to-site/m-p/1524860/highlight/false#M104986
In fact, I have to define policy-based routing for my OpenVPN site-to-site connection.
STEP 1: Define new routing tables with static routes for each load-balanced WAN
The problem with load balancing with failover is that the way it works is sometimes counterintuitive. Unless specified otherwise, when the failover occurs a new routing table is forked with some default values copied from the main table. As a result, if you have DHCP WANs with some default routes, you may end up with missing or wrong routes.
The safest way is to statically specify the default route for each WAN. Of course, this implies that both your WANs have static IPs and that your default gateways are also static IPs. Sadly, if one of your WAN IPs is obtained by DHCP there is still no valid solution as of firmware 1.8.
Define two new routing tables:
– table 1 : that will be the routing table for WAN 1
– table 2 : that will be the routing table for WAN 2
In table 1 we add the default route for eth0
configure
set protocols static table 1 route 0.0.0.0/0 next-hop 192.168.7.1
commit
save
exit
In table 2 we add the default route for eth1
configure
set protocols static table 2 route 0.0.0.0/0 next-hop 192.168.0.1
commit
save
exit
STEP 2: Define a firewall modify policy to select a different routing table for load-balance “group G”.
Change the routing table for load-balance
configure
set load-balance group G interface eth0 route table 1
set load-balance group G interface eth1 route table 2
commit
save
List the new load-balance configuration
ubnt@ubnt# show load-balance
group G {
    interface eth0 {
        route {
            table 1
        }
        route-test {
            initial-delay 60
            interval 10
            type {
                ping {
                    target 8.8.8.8
                }
            }
        }
    }
    interface eth1 {
        failover-only
        route {
            table 2
        }
        route-test {
            initial-delay 60
            interval 10
            type {
                ping {
                    target 8.8.8.8
                }
            }
        }
    }
    sticky {
        dest-addr enable
        dest-port enable
        source-addr enable
    }
}
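To make the intent of this group easier to follow, here is a minimal Python sketch of how I understand the configuration above: eth0 is the normal path and uses table 1, while eth1 is marked failover-only and its table 2 is used only when eth0 is down. This is only a model of the listing, not the load balancer's actual code; the names `LB_GROUP_G` and `tables_in_use` are made up for illustration.

```python
# Hypothetical model of load-balance group G as listed above.
LB_GROUP_G = [
    {"interface": "eth0", "table": 1, "failover_only": False},
    {"interface": "eth1", "table": 2, "failover_only": True},
]


def tables_in_use(group, up_interfaces):
    """Return the routing tables the group would use for the given set of
    interfaces whose route test currently succeeds (assumed semantics)."""
    primary = [
        m["table"] for m in group
        if m["interface"] in up_interfaces and not m["failover_only"]
    ]
    if primary:
        return primary
    # No regular interface is up: fall back to the failover-only ones.
    return [
        m["table"] for m in group
        if m["interface"] in up_interfaces and m["failover_only"]
    ]
```

Calling `tables_in_use(LB_GROUP_G, {"eth1"})` models the situation where WAN 1 has failed and only the LTE link is left.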
STEP 3: Add a static route to the remote LAN 9
Also add the route to the remote site to the main table as a static route. This is important because we have to tell the router that this network is reachable through the OpenVPN vtun0 interface.
configure
set protocols static interface-route 192.168.9.0/24 next-hop-interface vtun0
commit
save
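The reason this route wins over any default route is longest-prefix matching: the /24 is more specific than 0.0.0.0/0. A small Python sketch of that lookup, using only the standard `ipaddress` module, is shown here. The host 192.168.9.20 in the usage note is a made-up address on the remote LAN, and `ROUTES` and `lookup` are illustrative names, not anything from EdgeOS.

```python
import ipaddress

# Two routes taken from this post: a default route through WAN 1 and the
# interface-route to the remote LAN 9 through the tunnel.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "eth0 via 192.168.7.1"),
    (ipaddress.ip_network("192.168.9.0/24"), "vtun0"),
]


def lookup(destination, routes=ROUTES):
    """Return the exit of the most specific route that contains destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, exit_) for net, exit_ in routes if addr in net]
    # The route with the longest prefix (most specific network) wins.
    best = max(matches, key=lambda item: item[0].prefixlen)
    return best[1]
```

With these two routes, `lookup("192.168.9.20")` picks the tunnel, while any other destination falls through to the default route.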
STEP 4: Add the static routes to the WANs with different distance
Initially both my default routes were obtained by DHCP. Because of that, two default routes were still added to the main table, so I was losing packets while pinging the remote networks. Even after I switched to static IPs and defined the static routes by hand I was getting the same results.
In fact, I lost packets even when pinging the external WAN 1 IP. This means that by default packets were routed over both eth0 and eth1:
ubnt@ubnt:~$ show ip route
Codes: K - kernel, C - connected, S - static, R - RIP, B - BGP
       O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       > - selected route, * - FIB route, p - stale info

IP Route Table for VRF "default"
S    *> 0.0.0.0/0 [210/0] via 192.168.7.1, eth0, weight 1
S    *> 0.0.0.0/0 [210/0] via 192.168.0.1, eth1, weight 1
C    *> 10.99.99.1/32 is directly connected, vtun0
C    *> 10.99.99.2/32 is directly connected, vtun0
C    *> 127.0.0.0/8 is directly connected, lo
C    *> 192.168.0.0/24 is directly connected, eth1
C    *> 192.168.2.0/24 is directly connected, switch0
C    *> 192.168.7.0/24 is directly connected, eth0
S    *> 192.168.9.0/24 [1/0] is directly connected, vtun0
S    *> 192.168.10.0/24 [1/0] is directly connected, vtun0
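Note that both default routes in this output are selected and carry the same distance (210), which is consistent with traffic being spread over both WANs. As a rough illustration, the following Python sketch scans such lines for prefixes that have more than one selected next hop at the same distance. The regular expression is an assumption tailored to the two lines shown here, not a general parser for `show ip route`.

```python
import re
from collections import defaultdict

# The two default-route lines copied from the output above.
ROUTE_LINES = [
    "S *> 0.0.0.0/0 [210/0] via 192.168.7.1, eth0, weight 1",
    "S *> 0.0.0.0/0 [210/0] via 192.168.0.1, eth1, weight 1",
]

# Matches selected static routes of the form "S *> prefix [dist/metric] via gw, dev".
ROUTE_RE = re.compile(
    r"^S\s+\*>\s+(?P<prefix>\S+)\s+\[(?P<distance>\d+)/\d+\]"
    r"\s+via\s+(?P<gw>[\d.]+),\s+(?P<dev>\w+)"
)


def equal_cost_groups(lines):
    """Group routes by (prefix, distance) and keep groups with several next hops."""
    groups = defaultdict(list)
    for line in lines:
        match = ROUTE_RE.match(line.strip())
        if match:
            key = (match["prefix"], int(match["distance"]))
            groups[key].append((match["gw"], match["dev"]))
    return {key: hops for key, hops in groups.items() if len(hops) > 1}
```

For the lines above, `equal_cost_groups(ROUTE_LINES)` reports the default route with its two next hops on eth0 and eth1.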
Then I tried to delete the default routes and leave only the default routes from table 1 and table 2.
This still did not work because for some reason the router needs a “route of last resort”.
Then I tried to add only the route to WAN 1 as the route of last resort. In this case the load balancer was no longer able to verify that WAN 2 was up. Strangely, the route-test ping seems to ignore table 2 and wants to go through the main routing table. Because the main routing table had no route to WAN 2, the test was failing.
The miracle solution was to define static routes to both WANs in the main table, but with different distances. Defining the route to WAN 1 with distance 1 and the route to WAN 2 with distance 200 (allowed values are between 1 and 250) solved the problem:
– all the packets that are routed by the main table go through the route with the smaller distance. I am no longer losing packets on the VPN or when pinging WAN 1.
– packets that explicitly need to go through WAN 2 are able to do so because there is a static route to WAN 2.
configure
set protocols static route 0.0.0.0/0 next-hop 192.168.7.1 distance 1
set protocols static route 0.0.0.0/0 next-hop 192.168.0.1 distance 200
commit
save
exit
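The effect of the two distances can be pictured with a few lines of Python: among the default routes that are currently usable, the one with the lowest distance is installed, and the other one only takes over when the first disappears. This is a simplified sketch. `MAIN_DEFAULT_ROUTES` and `select_default` are illustrative names, and "usable" here stands in for whatever condition makes the router withdraw a static route (for example, the interface going down).

```python
# Default routes of the main table as (next_hop, interface, distance),
# using the WAN 1 = distance 1 and WAN 2 = distance 200 values from this post.
MAIN_DEFAULT_ROUTES = [
    ("192.168.7.1", "eth0", 1),
    ("192.168.0.1", "eth1", 200),
]


def select_default(routes, usable_interfaces):
    """Pick the usable default route with the lowest distance, or None."""
    candidates = [route for route in routes if route[1] in usable_interfaces]
    # Lower administrative distance means higher preference.
    return min(candidates, key=lambda route: route[2], default=None)
```

With both interfaces usable the WAN 1 route is chosen; `select_default(MAIN_DEFAULT_ROUTES, {"eth1"})` shows the WAN 2 route taking over.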
The final routing configuration looks like:
ubnt@ubnt# show protocols
protocols {
    static {
        interface-route 192.168.9.0/24 {
            next-hop-interface vtun0 {
            }
        }
        route 0.0.0.0/0 {
            next-hop 192.168.0.1 {
                distance 200
            }
            next-hop 192.168.7.1 {
                distance 1
            }
        }
        table 1 {
            route 0.0.0.0/0 {
                next-hop 192.168.7.1 {
                }
            }
        }
        table 2 {
            route 0.0.0.0/0 {
                next-hop 192.168.0.1 {
                }
            }
        }
    }
}
There are no changes to the remote site.
STEP 5: Apply the changes
Reset the load balancer with ubnt-stig's trick:
sudo pkill ubnt-util
Reset the OpenVPN tunnel to apply the new configuration:
reset openvpn interface vtun0