If you are facing issues with hosts in a VMware HA cluster, I recommend following the 10 basic troubleshooting steps below. Most of the time, these will resolve the issue.
The error message will be similar to this one:
[Screenshot: HA error message]
1. Check your environment for any temporary network problems.
2. Check that DNS is configured properly and that the hosts in the cluster can resolve each other by name.
3. Check the VMware HA agent status on the ESX host using the command below:
service vmware-aam status
4. Check that the ESX networks are properly configured and named exactly the same as on the other hosts in the cluster. Otherwise, you will get errors while installing or reconfiguring the HA agent.
5. Check that the HA-related ports are open in the firewall to allow communication:
Incoming ports: TCP/UDP 8042-8045
Outgoing ports: TCP/UDP 2050-2250
6. Try restarting (or stopping and then starting) the VMware HA agent on the affected host using the commands below:
service vmware-aam restart
service vmware-aam stop
service vmware-aam start
In addition, you can also try restarting vpxa and the management agent on the host.
7. Right-click the affected host and click “Reconfigure for VMware HA” to reinstall the HA agent on that particular host.
8. Remove the affected host from the cluster and add it back. An ESX host cannot be removed from the cluster until it has been put into maintenance mode.
9. As an alternative to step 8, go to the cluster settings and uncheck VMware HA to turn off HA for that cluster, then re-enable VMware HA so that the agent is installed from scratch.
10. For further troubleshooting, review the HA logs under the /var/log/vmware/aam directory.
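Several of the checks above (steps 2, 3, 5, 6 and 10) can be collected into a few small shell functions. This is only a rough sketch, assuming a classic ESX service console with a Bourne-style shell. The function names are made up for illustration, and the availability of `getent` as well as the `mgmt-vmware` and `vmware-vpxa` service names are assumptions you should verify on your ESX version.

```shell
#!/bin/sh
# Illustrative helpers only; none of these function names come from VMware.

# Step 5: succeed if PORT lies inside the HA range for DIRECTION.
# Ranges are the ones listed in step 5. Returns 2 for an unknown direction.
is_ha_port() {
    direction=$1
    port=$2
    case "$direction" in
        incoming) low=8042; high=8045 ;;
        outgoing) low=2050; high=2250 ;;
        *) return 2 ;;
    esac
    [ "$port" -ge "$low" ] && [ "$port" -le "$high" ]
}

# Step 2: print OK or FAIL for each host name, depending on whether it resolves.
# Assumption: getent is available on the service console.
check_dns() {
    rc=0
    for name in "$@"; do
        if getent hosts "$name" >/dev/null 2>&1; then
            echo "OK   $name"
        else
            echo "FAIL $name"
            rc=1
        fi
    done
    return $rc
}

# Steps 3 and 10: show the HA agent status and the most recent HA log files.
ha_summary() {
    service vmware-aam status
    ls -lt /var/log/vmware/aam | head
}

# Step 6 (second part): restart the management agent and vpxa.
# Assumption: these are the service names on your ESX release.
restart_mgmt_agents() {
    service mgmt-vmware restart
    service vmware-vpxa restart
}
```

For example, `check_dns esx01 esx02` (placeholder host names) prints one OK or FAIL line per name, and `is_ha_port incoming 8043` succeeds because 8043 falls inside the incoming range. Nothing runs until you call a function, so the file is safe to source first and use one check at a time.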
Thanks for reading!