Common High Availability Configuration Mistakes

Some common High Availability (HA) configuration mistakes are listed here:

  • Incorrect disk configuration

    • VCS: If a resource cannot be probed, the resource configuration is incorrect. If a disk cannot be probed, the operating system might no longer be able to access the disk.
    • Test the disk configuration manually, and confirm against the HA product documentation that the configuration is appropriate.
  • The disk is in use and cannot be started for the HA resource group.

    Always check that the disk is not activated before starting the HA resource group.

  • WSFC: Bad network configuration

    If network traffic flows across multiple NICs, RDP sessions fail when you start programs that consume a large amount of network bandwidth, such as the NNMi ovjboss process.

  • Some HA products do not automatically restart at boot time.

    Review the HA product documentation for information about how to configure automatic restart at boot time.

  • Adding NFS or other access directly in the operating system. The HA resource group configuration should manage this access.
  • Having a shell or process whose working directory is in the shared disk mount point during a failover or while the HA resource group is being taken offline.

    The HA product kills any processes that prevent the shared disk from being unmounted.

  • Reusing the HA cluster virtual IP address as the HA resource virtual IP address. This configuration works on one system but not on the other.
  • Timeouts are too short. If the products are misbehaving, the HA product might time out the HA resource and cause a failover.

    WSFC: In Failover Cluster Management, check the value of the Time to wait for resource to start setting. NNMi sets this value to 15 minutes. You can increase the value.

  • Not using maintenance mode

    Maintenance mode was created for debugging HA failures. If you attempt to bring a resource group online on a system and it fails over shortly afterwards, use maintenance mode to keep the resource group online so that you can see what is failing.

  • Not reviewing cluster logs. Cluster logs can reveal many common mistakes.
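
Two of the mistakes above (a shared disk that is already in use, and a process sitting in the shared disk mount point) can be checked before you start or stop the HA resource group. The following is a minimal sketch only, assuming a Linux system that exposes /proc. It is not part of NNMi or of any HA product. The function names and the /nnm_shared mount point used in the usage note are hypothetical and chosen for illustration.

```python
import os
from pathlib import Path


def is_mounted(mount_point, mounts_text):
    """Return True if mount_point is a mount target in /proc/mounts-style text."""
    target = os.path.normpath(mount_point)
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # /proc/mounts escapes spaces in paths as \040.
        mounted_on = fields[1].replace("\\040", " ")
        if os.path.normpath(mounted_on) == target:
            return True
    return False


def pids_with_cwd_under(mount_point, proc_root="/proc"):
    """Return sorted PIDs whose working directory is at or below mount_point."""
    target = os.path.normpath(mount_point)
    found = []
    for entry in Path(proc_root).iterdir():
        # Only numeric directory names under /proc describe processes.
        if not entry.name.isdigit():
            continue
        try:
            cwd = os.readlink(entry / "cwd")
        except OSError:
            # The process exited, or we lack permission to inspect it.
            continue
        if cwd == target or cwd.startswith(target + os.sep):
            found.append(int(entry.name))
    return sorted(found)
```

To use the sketch, pass the contents of /proc/mounts to is_mounted, for example is_mounted("/nnm_shared", open("/proc/mounts").read()), and call pids_with_cwd_under("/nnm_shared") to list the processes to move out of the mount point before a failover. Run it as root to see the working directories of processes owned by other users. Processes that hold files open on the shared disk without having their working directory there are not detected by this check.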