Continuing from my previous post, Automate VPN connection and its TGW attachment, this post shares a solution for VPN failover via Transit Gateway (TGW) attachments.
The key components in the solution are:
- Network Manager for Transit Gateway
- EventBridge
- Lambda
The basic idea is:
- Register the TGW with Network Manager, which can monitor tunnel status changes of the VPN connections attached to the TGW. (Network Manager is a global service, but it can only be configured in the us-west-2 region.)
- Set up EventBridge rules to catch the events (VPN-CONNECTION-IPSEC-UP, VPN-CONNECTION-IPSEC-DOWN) emitted by Network Manager, and send those events to a Lambda function.
- The Lambda function updates the route entry in the TGW route table so that it points to the alternative VPN connection's attachment.
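The event-handling side of these steps can be sketched as follows. This is a minimal sketch, not the exact code from my setup: the event pattern fields (`source`, `detail-type`) and the `vpnConnectionArn` key in the event detail are my assumptions about the Network Manager event shape, so check them against a real event before relying on them. The helper name `parse_status_event` is hypothetical.

```python
import json

# Assumed EventBridge event pattern for Network Manager tunnel events.
# Verify the "source" and "detail-type" values against an actual event.
EVENT_PATTERN = {
    "source": ["aws.networkmanager"],
    "detail-type": ["Network Manager Status Update"],
    "detail": {
        "changeType": [
            "VPN-CONNECTION-IPSEC-UP",
            "VPN-CONNECTION-IPSEC-DOWN",
        ]
    },
}


def parse_status_event(event):
    """Return (vpn_connection_id, change_type) from an event dict.

    Assumes the VPN connection ARN is in detail["vpnConnectionArn"]
    and ends with ".../vpn-xxxxxxxx". Missing fields yield None.
    """
    detail = event.get("detail", {})
    change_type = detail.get("changeType")
    arn = detail.get("vpnConnectionArn", "")
    vpn_id = arn.rsplit("/", 1)[-1] if arn else None
    return vpn_id, change_type


def lambda_handler(event, context):
    """Lambda entry point: log which VPN changed and how."""
    vpn_id, change_type = parse_status_event(event)
    result = {"vpn": vpn_id, "change": change_type}
    print(json.dumps(result))
    # The route update itself would be triggered from here.
    return result
```

The pattern is kept as a plain dict so it can be serialized with `json.dumps(EVENT_PATTERN)` and pasted into whatever tool creates the rule.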
In my example, I have two VPN connections: Primary and Failover. They are connected to different PoPs of a SaaS proxy platform. All VPCs send their Internet traffic to the SaaS proxy via the primary VPN TGW attachment (e.g. static route: 0.0.0.0/0 -> primary VPN TGW attachment). When both tunnels of the primary VPN are down, the Lambda function automatically fails over to the failover VPN by updating the static route (e.g. 0.0.0.0/0 -> failover VPN TGW attachment). Later, when both tunnels of the primary VPN come back up, the Lambda function fails back to the primary VPN.
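The failover and failback decision described above could look roughly like this. It is a sketch under assumptions: the `cfg` keys and the function names `pick_target` and `reconcile_default_route` are hypothetical, and `ec2` is expected to be a boto3 EC2 client (for example `boto3.client("ec2", region_name=...)` in the region where the TGW lives). The tunnel status is read from the `VgwTelemetry` field of `describe_vpn_connections`, which may lag slightly behind the event.

```python
def pick_target(statuses, primary_attachment, failover_attachment):
    """Decide which TGW attachment the default route should use.

    statuses: list of tunnel status strings ("UP" / "DOWN") for the
    primary VPN. Returns None when no change should be made, i.e.
    when only some tunnels are up or no telemetry is available.
    """
    if not statuses:
        return None
    if all(s == "UP" for s in statuses):
        return primary_attachment      # fail back
    if all(s != "UP" for s in statuses):
        return failover_attachment     # fail over
    return None                        # partially up: leave route as is


def reconcile_default_route(ec2, cfg):
    """Point 0.0.0.0/0 at the right attachment based on tunnel state.

    cfg is a hypothetical dict with keys: primary_vpn_id,
    primary_attachment_id, failover_attachment_id, route_table_id.
    """
    resp = ec2.describe_vpn_connections(
        VpnConnectionIds=[cfg["primary_vpn_id"]]
    )
    telemetry = resp["VpnConnections"][0].get("VgwTelemetry", [])
    statuses = [t["Status"] for t in telemetry]

    target = pick_target(
        statuses,
        cfg["primary_attachment_id"],
        cfg["failover_attachment_id"],
    )
    if target is not None:
        # Replaces the existing static route in the TGW route table.
        ec2.replace_transit_gateway_route(
            DestinationCidrBlock="0.0.0.0/0",
            TransitGatewayRouteTableId=cfg["route_table_id"],
            TransitGatewayAttachmentId=target,
        )
    return target
```

Passing the client in as a parameter keeps the decision logic easy to test with a stub. Note that this sketch rewrites the route even if it already points at the chosen attachment; a real function might first check the current route to avoid redundant calls.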