Reconfigure vSphere HA Host Operation Timed Out

I think it is worth sharing what I learned today. We found that HA was not functioning on any of the hosts in a vSphere cluster. Reconfiguring vSphere HA on a host always failed with an ‘Operation timed out’ error message. I checked fdm.log on a couple of servers and found that they all contained a message like this:

error ‘Election’ opID=SWI-cb1a0483] ReadMsg: [120 times] Wrong fault domain ID: 9148BCE8-A6E7-45D7-B591-76C15A3F6470-26-9e10b65-my-vCenter!= 9148BCE8-A6E7-45D7-B591-76C15A3F6470-26-8284ae7-my-vCenter from 192.168.1.102

My understanding of this message is that the master election failed because the local host and the remote host (192.168.1.102) reported different fault domain IDs. The weird thing is that 192.168.1.102 was a host that had been placed into maintenance mode. So my guess was that 192.168.1.102 had been the master of the fault domain, but somehow failed to tell the other hosts when it left the domain and entered maintenance mode. To test my guess, I took the host out of maintenance mode, and all the red HA failure alarms disappeared right away!
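To spot this pattern quickly across several hosts, the fdm.log lines can be scanned for the mismatch message. The following is only a minimal sketch, assuming the message always has the shape shown above (`Wrong fault domain ID: <local>!= <remote> from <ip>`); the function names and the regular expression are my own illustration and not part of any VMware tool.

```python
import re
from collections import Counter
from typing import Iterable, Optional, Tuple

# Assumed message shape, taken from the single log line quoted above:
#   Wrong fault domain ID: <local-id>!= <remote-id> from <remote-ip>
MISMATCH_RE = re.compile(
    r"Wrong fault domain ID:\s*(\S+?)\s*!=\s*(\S+)\s+from\s+(\S+)"
)


def parse_fault_domain_mismatch(line: str) -> Optional[Tuple[str, str, str]]:
    """Return (local_id, remote_id, remote_ip), or None if the line does not match."""
    match = MISMATCH_RE.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)


def count_mismatch_sources(lines: Iterable[str]) -> Counter:
    """Count how many mismatch lines name each remote IP."""
    sources = Counter()
    for line in lines:
        parsed = parse_fault_domain_mismatch(line)
        if parsed is not None:
            sources[parsed[2]] += 1
    return sources


if __name__ == "__main__":
    sample = (
        "error 'Election' opID=SWI-cb1a0483] ReadMsg: [120 times] "
        "Wrong fault domain ID: "
        "9148BCE8-A6E7-45D7-B591-76C15A3F6470-26-9e10b65-my-vCenter!= "
        "9148BCE8-A6E7-45D7-B591-76C15A3F6470-26-8284ae7-my-vCenter "
        "from 192.168.1.102"
    )
    print(parse_fault_domain_mismatch(sample))
    print(count_mismatch_sources([sample]))
```

If every host's log names the same remote IP, as happened here with 192.168.1.102, that host is a good first suspect.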

I checked the HA status and everything looked good: a new master had been elected successfully. Then I placed 192.168.1.102 into maintenance mode again, and HA on all the other hosts kept functioning well.


Published by Jackie Chen
