NSX Edge nodes can be virtual appliances or bare-metal instances. NSX-T Edge nodes are service appliances with pools of capacity, dedicated to running network services that cannot be distributed to the hypervisors.
The NSX-T Edge appliance provides routing services and North-South connectivity to networks that are external to the NSX-T environment. An NSX Edge is required if you want to deploy a Tier-0 router, or a Tier-1 router with stateful services such as NAT, DHCP server, or edge firewall.
You use an NSX-T Edge to establish external connectivity from the NSX-T domain through a Tier-0 router, using BGP or static routing.
How to Replace a Faulty NSX-T Edge Node in an Edge Cluster
Recently I came across a situation in one of my production environments. The customer has a Tier-0 gateway with an edge cluster associated with it, and the Tier-0 runs in Active-Passive (Active-Standby) mode. We noticed that all TCP packets failed whenever traffic went through one of the edge nodes.
To isolate the issue, we manually powered off the active edge node of the Tier-0 gateway. To find out which node is active, take a look at my article How to Identify the Active Edge Node of NSX-T Tier-0/Tier-1 Gateway.
The other edge node in the edge cluster became active and everything started working, with no issues found. The failure was observed only when the faulty edge node was active. We involved VMware support, who also suggested replacing the faulty edge node with a new node.
In this article, I will walk through the detailed step-by-step procedure to replace a faulty NSX-T edge node in an edge cluster.
Note: If the NSX Edge node to be replaced is not running, the new NSX Edge node can have the same name, management IP address, and TEP IP address. If the NSX Edge node to be replaced is running, the new NSX Edge node must have a different name, management IP address and TEP IP address.
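The rule in the note above can be expressed as a small pre-check before you deploy the replacement. This is only a sketch: the `EdgeIdentity` class and its field names (`name`, `mgmt_ip`, `tep_ip`) are hypothetical and chosen for illustration; they are not an NSX-T data model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EdgeIdentity:
    """Hypothetical holder for the three values the note refers to."""
    name: str
    mgmt_ip: str
    tep_ip: str


def identity_conflicts(old: EdgeIdentity, new: EdgeIdentity,
                       old_is_running: bool) -> List[str]:
    """Return the fields the new node must not share with the old one.

    If the old node is not running, its identity may be reused, so
    nothing conflicts. If it is running, the name, management IP and
    TEP IP must all be different.
    """
    if not old_is_running:
        return []
    return [
        field
        for field in ("name", "mgmt_ip", "tep_ip")
        if getattr(old, field) == getattr(new, field)
    ]
```

An empty list means the planned replacement satisfies the rule; any returned field name has to be changed before the new node is deployed.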
In my case, the faulty edge node is edgenode-02a, which is currently up and running. So I have deployed edgenode-03a with the same configuration as edgenode-02a but with a different name, management IP, and TEP IP (auto-assigned from the IP pool).
You can verify which edge nodes are part of the edge cluster under Fabric -> Nodes -> Edge Transport Nodes. In my case, edgenode-01a and edgenode-02a are part of the edge cluster “EdgeCluster-01a”.
You can also validate the same from the edge cluster view. Expand Fabric -> Edge Clusters -> click the cluster name. It will show the transport nodes (edge nodes) that are part of this edge cluster.
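The same membership check can be scripted. To my understanding, the NSX-T Manager API returns an edge cluster from `GET /api/v1/edge-clusters/<edge-cluster-id>` with a `members` list whose entries carry `member_index` and `transport_node_id`; treat the endpoint and field names as assumptions and verify them against the API guide for your version. The sketch below only builds the URL and parses an already-fetched payload, so it makes no network calls. The manager hostname and IDs in any usage are placeholders.

```python
from typing import Any, Dict


def edge_cluster_url(manager: str, cluster_id: str) -> str:
    """Build the (assumed) URL for reading one edge cluster."""
    return f"https://{manager}/api/v1/edge-clusters/{cluster_id}"


def member_node_ids(cluster: Dict[str, Any]) -> Dict[int, str]:
    """Map member_index -> transport_node_id from an edge cluster payload.

    A cluster without a 'members' key is treated as having no members.
    """
    return {
        member["member_index"]: member["transport_node_id"]
        for member in cluster.get("members", [])
    }
```

Note that the payload lists transport node IDs, not display names, so you still need to map each ID to a name such as edgenode-02a from the Edge Transport Nodes view.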
Before replacing the faulty edge node with a new node, we have to place the faulty edge node “edgenode-02a” into NSX Maintenance Mode.
To place the NSX-T edge node into maintenance mode, select the NSX-T edge node -> select “Enter NSX Maintenance Mode” under Actions.
Click YES to confirm placing the edge node “edgenode-02a” in NSX Maintenance Mode.
The faulty edge node edgenode-02a has now entered “NSX Maintenance Mode”, and its node status shows as “Down” as well.
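If you prefer the API over the UI for this step, the request might look like the sketch below. It assumes the NSX-T Manager API accepts `POST /api/v1/transport-nodes/<node-id>?action=enter_maintenance_mode` (and `exit_maintenance_mode` for the reverse) with HTTP basic authentication; please confirm this against the API guide for your NSX-T version. The function only constructs the request object and does not send it. The hostname, node ID, and credentials are placeholders.

```python
import base64
import urllib.request


def maintenance_mode_request(manager: str, node_id: str, user: str,
                             password: str, enter: bool = True) -> urllib.request.Request:
    """Build (but do not send) a maintenance-mode request for an edge node."""
    # Action names are assumptions taken from the NSX-T API as I know it.
    action = "enter_maintenance_mode" if enter else "exit_maintenance_mode"
    url = f"https://{manager}/api/v1/transport-nodes/{node_id}?action={action}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Basic {token}"},
    )
```

Sending it is then a matter of passing the returned object to `urllib.request.urlopen`, with whatever TLS settings your environment requires.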
To replace the faulty edge node, go to System -> Fabric -> Edge Clusters and select the edge cluster “EdgeCluster-01a”.
Select Replace Edge Cluster Member under Actions.
Select the faulty NSX-T edge node from the drop-down under the Replace option. In my case, that is edgenode-02a.
Select the newly deployed edge node from the drop-down under the With option. In my case, the newly deployed edge node is edgenode-03a. Then click Save.
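The replacement itself can also be sketched as an API call. As far as I know, the NSX-T Manager API offers `POST /api/v1/edge-clusters/<edge-cluster-id>?action=replace_transport_node` with a body containing the `member_index` of the slot to replace and the `transport_node_id` of the new node; treat both the endpoint and the body shape as assumptions to verify for your version. The helper below looks up the faulty node's slot in an already-fetched cluster payload and returns the URL and JSON body without sending anything. All IDs and the hostname are placeholders.

```python
import json
from typing import Any, Dict, Tuple


def build_replace_call(manager: str, cluster: Dict[str, Any],
                       faulty_node_id: str, new_node_id: str) -> Tuple[str, str]:
    """Return (url, json_body) for replacing one edge cluster member."""
    # Find the slot currently occupied by the faulty node.
    for member in cluster.get("members", []):
        if member["transport_node_id"] == faulty_node_id:
            index = member["member_index"]
            break
    else:
        raise ValueError(f"{faulty_node_id} is not a member of this edge cluster")

    url = (f"https://{manager}/api/v1/edge-clusters/"
           f"{cluster['id']}?action=replace_transport_node")
    body = json.dumps({"member_index": index, "transport_node_id": new_node_id})
    return url, body
```

Keeping the member index is the point of a replace as opposed to a remove followed by an add: the new node takes over the same slot in the cluster that the faulty node occupied.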
After the replacement, edgenode-01a and edgenode-03a are the members of the edge cluster “EdgeCluster-01a”. The tunnel is up and the node status of the new node is Up as well.
You can also validate the edge cluster members from the edge cluster view. The members are now edgenode-01a and edgenode-03a.
After replacing the faulty edge node, you can also validate the traffic and the Tier-0 gateway status. In my case, everything turned out to be healthy. That’s it. We are done with replacing the faulty NSX-T edge node in the edge cluster.
You can also watch the detailed step-by-step video on How to Replace the Faulty NSX-T Edge node in the NSX-T edge cluster from my YouTube channel.
I hope this is informative for you. Thanks for reading! Be social and share it on social media if you feel it is worth sharing.