High Availability and Failover
Currently we support the following HA/failover scenarios:
We support active-active configurations with multiple gateways for a single VPN instance or location. Because our gateway uses vanilla kernel WireGuard®, there are several ways to implement this.
See also the Creating a New VPN location documentation, where each location setting includes information on high availability.
To set up multiple gateways for one location, deploy the gateway on each server.
If you already have a gateway deployed and want to add more gateways to the location, go to VPN Overview, click Edit Location Settings (in the top right corner), choose the location to which you want to add the gateway, and follow the deployment instructions.
After deployment, all gateways have the same configuration and bind to the port defined in the location's Gateway Port setting.
The only thing left to do is to point your traffic to those gateways, which can be accomplished with various HA scenarios:
floating public IP - if you choose this scenario, the floating IP must be the one specified in the location's Gateway Address
proxy/load balancing - the proxy or load balancer must likewise be configured with the location's Gateway Address and Gateway Port
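The proxy/load balancing scenario can be illustrated with a minimal Python sketch. It assumes a balancer that keeps each client on the same gateway while that gateway is healthy, by hashing the client's source address. The function name, addresses, and hashing scheme are hypothetical and only illustrate the idea; they are not part of the product or of any particular load balancer.

```python
import hashlib


def pick_gateway(client_addr: str, healthy_gateways: list[str]) -> str:
    """Pick a gateway for a client by hashing the client's source address.

    Hypothetical helper: the same client maps to the same gateway as long as
    the set of healthy gateways does not change.
    """
    if not healthy_gateways:
        raise RuntimeError("no healthy gateway available")
    # Sort so the result does not depend on the order of the input list.
    ordered = sorted(healthy_gateways)
    digest = hashlib.sha256(client_addr.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(ordered)
    return ordered[index]


# Example with made-up backend addresses (documentation IP ranges).
gateways = ["10.0.0.11:50051", "10.0.0.12:50051"]
chosen = pick_gateway("203.0.113.7", gateways)
```

Because every gateway for a location shares the same configuration, any of them can serve a client; if the chosen gateway drops out of the healthy list, the same function simply maps the client to one of the remaining gateways.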
All gateways that are successfully connected for a location are displayed under that location in VPN Overview. Here is an example with two gateways:
For VPN Locations without MFA, the peer configuration persists until the system reboots, even if the gateway stops working, because the gateway configures WireGuard "in kernel".
For VPN Locations with MFA, this depends on the Peer Disconnect Threshold (seconds) setting in the VPN Location settings. This setting specifies that if a peer is inactive for the defined number of seconds, the gateway removes it from the configuration. Therefore, if the proxy or core is not operational, MFA authentication will fail, and a peer that has been disconnected will not be added again.
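The effect of the Peer Disconnect Threshold can be pictured with a small Python sketch. It is only an illustration of the rule described above, not the gateway's actual implementation: the function name, the peer names, the use of plain timestamps, and the strict "greater than" comparison are all assumptions.

```python
def peers_to_remove(
    last_activity: dict[str, float],
    threshold_seconds: float,
    now: float,
) -> list[str]:
    """Return peers that have been inactive longer than the threshold.

    last_activity maps a (hypothetical) peer name to the timestamp, in
    seconds, at which the peer was last seen active.
    """
    return sorted(
        peer
        for peer, seen in last_activity.items()
        if now - seen > threshold_seconds
    )


# "alice" has been idle for 400 s, "bob" for 110 s; threshold is 300 s.
activity = {"alice": 1000.0, "bob": 1290.0}
print(peers_to_remove(activity, threshold_seconds=300, now=1400.0))
# prints ['alice']
```

In this model, a removed peer would have to pass MFA again to be re-added, which is why the proxy and core must be reachable for MFA locations.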
The core service manages gateway state and also initiates the connection to the proxy. Because the proxy serves HTTP-based communication and is meant to be exposed to the public Internet, it must be secured, so the connection is opened from core to the proxy rather than the other way around.
This way, core can sit in an intranet network segment and the proxy in a DMZ, leaving core completely cut off from the Internet at the firewall (you only need outgoing firewall rules from the intranet that allow core to connect to the proxy).
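The firewall arrangement described above can be modelled in a few lines of Python. This is a simplified sketch, not real firewall configuration: the zone names, the Rule type, and the single allow rule are hypothetical, and a stateful firewall is assumed so that replies on an allowed connection pass back automatically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A hypothetical allow rule for new connections."""
    src_zone: str
    dst_zone: str
    dst_service: str


# Only one rule: core (intranet) may open connections to the proxy (DMZ).
ALLOWED_RULES = frozenset({Rule("intranet", "dmz", "proxy")})


def connection_allowed(src_zone: str, dst_zone: str, dst_service: str) -> bool:
    """Return True if a new connection matches an allow rule; deny otherwise."""
    return Rule(src_zone, dst_zone, dst_service) in ALLOWED_RULES


print(connection_allowed("intranet", "dmz", "proxy"))      # prints True
print(connection_allowed("internet", "intranet", "core"))  # prints False
print(connection_allowed("dmz", "intranet", "core"))       # prints False
```

Nothing in this model can open a connection towards core, which matches the intent of having core initiate the link to the proxy.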
High availability for core and proxy is therefore more complicated: with multiple proxies, core needs to manage all of those connections. Most of the code for this already exists, but it is not yet production ready.
We recommend deploying them on a failover solution such as a Kubernetes cluster (even a small one, like minikube). This way, Kubernetes manages health checks and performs failover. You can have an N-node cluster, and if any VM/node running Core/Proxy goes offline or fails its health checks, the workload is migrated to a new node.
Failover is also good enough for now, since:
gateways are fully active-active HA,
even if they fail, peers are persistent (fully, or depending on configuration).
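The failover behaviour recommended above for Core and Proxy can be pictured with a toy Python model. It is not how Kubernetes is implemented; it only shows the idea that a workload on an unhealthy node is moved to a healthy one. The function name, node names, and the "least loaded node" choice are assumptions made for illustration.

```python
def reschedule(
    placement: dict[str, str],
    node_healthy: dict[str, bool],
) -> dict[str, str]:
    """Return a new workload-to-node placement with unhealthy nodes evacuated."""
    healthy = sorted(node for node, ok in node_healthy.items() if ok)
    if not healthy:
        raise RuntimeError("no healthy node left in the cluster")

    # Workloads already on healthy nodes stay where they are.
    result = {w: n for w, n in placement.items() if node_healthy.get(n, False)}

    for workload in placement:
        if workload in result:
            continue
        # Move the workload to the healthy node that currently runs the fewest
        # workloads; ties are broken by node name.
        target = min(
            healthy,
            key=lambda n: (sum(1 for x in result.values() if x == n), n),
        )
        result[workload] = target
    return result


placement = {"core": "node-1", "proxy": "node-2"}
health = {"node-1": False, "node-2": True, "node-3": True}
print(reschedule(placement, health))
# prints {'proxy': 'node-2', 'core': 'node-3'}
```

During such a move the gateways keep serving traffic on their own, which is why a short Core/Proxy interruption is tolerable in this setup.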