Core Health Check Monitor (HCM)

HCM provides two types of tests:

  • Local Tests: Validate the health of a core on a component-by-component basis.
  • Global Tests: Validate the health of a core on a holistic basis.

Overview of HCM local tests

  • The HCM local tests validate individual core components. The local tests reside on the same server as the components they validate. Run local tests by running the SA Start script (/etc/init.d/opsware-sas) and specifying a test mode argument and optional component names.

  • The test mode specifies the set of tests to run; you cannot specify individual tests. Each test runs only once, even if you specify multiple components that require the same test. The test results are displayed on stdout.

  • You cannot run the Health Check Monitor from a Satellite host.

Syntax of the script for HCM local tests

HCM local tests use the following syntax:

/etc/init.d/opsware-sas <mode> [<component> [<component>...]] [<name>=<value> [<name>=<value>]...]
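
As a sketch of how the pieces of this syntax fit together, the following shell function prints (rather than runs) a local-test command line after checking that the mode is one of the mode strings listed in the options table. The function name hcm_local_cmd is hypothetical and is not part of SA.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SA): print an HCM local-test
# command line without executing it.
# Usage: hcm_local_cmd <mode> [component ...] [name=value ...]
hcm_local_cmd() {
    if [ $# -lt 1 ]; then
        echo "usage: hcm_local_cmd <mode> [component ...] [name=value ...]" >&2
        return 1
    fi
    mode=$1
    shift
    # Accept only the mode strings documented for the HCM local tests.
    case "$mode" in
        status|verify_post|verify_pre|verify_functionality|health) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # Components and name=value options are passed through unchanged;
    # sed trims the trailing space left when none are given.
    echo "/etc/init.d/opsware-sas $mode $*" | sed 's/ *$//'
}

hcm_local_cmd status twist
hcm_local_cmd health terse=true
```

Printing the command first makes it easy to review the arguments before running the command as root on the core server.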

Running HCM local tests

To run the local tests, perform the following steps:

  1. Log on as root to the server running the SA core components that you want to test.
  2. Run the SA startup script with the status argument, or specify another mode (test category) argument and one or more components (see the table below for the command options). For example, the following command verifies that the Web Services Data Access Engine is available:

    /etc/init.d/opsware-sas status twist

    The following table describes the HCM command-line arguments. For a description of the opsware-sas options for starting and stopping a core, see the Options for the SA Start/Stop Script table.

    Options for the HCM Local Test Script

    Option

    Description

    mode

    The set of tests to run. The mode can be one of the following strings:

    status: Runs tests that verify the availability of the specified components. For example, the tests verify that the components are listening on the correct ports and responding to basic queries.

    verify_post: Same as status.

    verify_pre: Runs tests that validate the conditions necessary for the specified components to operate.

    verify_functionality: Runs tests that are similar to the tests run by the status mode; however, they might take longer to run. Therefore, you might choose to skip these tests to save time.

    health: Runs the tests of the status, verify_pre, and verify_functionality modes and provides an overview of the overall state of the specified components.

    component

    The internal name of the core component. If this option is not specified, then all components are validated. To view the internal names of the components installed on the local server, enter the following command:

    /etc/init.d/opsware-sas list

    name=value

    Options that control how the tests are run. Allowed values:

    terse=[true|false]: If true, summarizes the results of all successful tests for each component in a single SUCCESS message; however, the results of failed tests are displayed individually. By default, this option is set to false. (This option is passed to the individual tests.)

    parsable=[true|false]: If true, summarizes the results from all tests for each component with a single SUCCESS or FAILURE message. By default, this option is set to false. (This option is passed to the individual tests.)

    verify_filter=<regex>: Runs only the tests whose file names match the regular expression you enter. For example, specifying verify_filter="OPSW" runs only tests with file names that contain the string OPSW, such as 100_OPSWcheck_host_spin.sh. By default, this option is not defined. (This option is not passed to the individual tests.)

    If a given test is a symbolic link to another file, verify_filter is evaluated against the file name of the link's target, not the name of the symbolic link itself.
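
The symbolic-link rule for verify_filter can be illustrated with a short shell sketch. It is not the SA implementation: the function name hcm_filter_tests and the directory argument are hypothetical, the regular expression is assumed to be an extended regular expression, and readlink -f is assumed to be available (as in GNU coreutils on Linux).

```shell
#!/bin/sh
# Illustration only: list the files in a directory whose resolved file
# name matches a regular expression, mimicking the documented
# verify_filter behavior for symbolic links.
# Usage: hcm_filter_tests <directory> <regex>
hcm_filter_tests() {
    dir=$1
    regex=$2
    for f in "$dir"/*; do
        [ -e "$f" ] || continue
        # Resolve symbolic links so the match uses the target's file name.
        target=$(readlink -f "$f")
        name=$(basename "$target")
        if printf '%s\n' "$name" | grep -Eq -- "$regex"; then
            printf '%s\n' "$f"
        fi
    done
}

# Example call (the directory is a placeholder):
#   hcm_filter_tests /path/to/tests OPSW
```

In this sketch a link named 300_link.sh that points to 100_OPSWcheck_host_spin.sh is selected by the filter OPSW, even though the link name itself does not contain that string.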

You can find a list of the internal names used for certain core components and their standard names in SA Core Component internal names.
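
When the parsable=true option described above is set, each component is reported with a single SUCCESS or FAILURE message, which makes the output easier to check from a script. The following sketch counts the lines that contain the word FAILURE. The function name hcm_check_output is hypothetical, and the exact layout of the output lines is an assumption; only the SUCCESS and FAILURE keywords come from the option description.

```shell
#!/bin/sh
# Hypothetical helper: read HCM output on stdin, report how many lines
# contain FAILURE, and return non-zero if there is at least one.
hcm_check_output() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the count without aborting the function.
    failures=$(grep -c 'FAILURE' || true)
    echo "failed components: $failures"
    [ "$failures" -eq 0 ]
}

# Example call (run as root on a core server):
#   /etc/init.d/opsware-sas status parsable=true | hcm_check_output
```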

Overview of HCM global tests

A global HCM test checks an entire SA Core. Run these tests by executing the run_all_probes.sh script on the following hosts:

  • Sliced configuration: the server hosting the core’s Management Gateway and/or Infrastructure Component (in a Typical Install, the Management Gateway is installed on the server that hosts the Infrastructure Component).
  • Non-sliced configuration: the server hosting the primary Model Repository Multimaster Component for the core being validated.

Test results are displayed on stdout. The global tests cannot check the status of other cores in a multimaster mesh.

In a multiserver core, the global tests connect to the other core servers using SSH. All connections are made as root, or a non-root user with sudo permission. Authentication is performed by specifying the user password or the key file on the command line. If both are specified, then the user password is used. One of these authentication methods must be specified unless the server is the local host. Passwordless login with keyfile is supported only for the root user.

Running HCM global tests

To run the HCM global tests, perform the following steps:

  1. Log in as root to the server that hosts the Model Repository Multimaster Component and/or the Infrastructure Component.
  2. Execute the run_all_probes.sh script with the run option (see the following section for details on the options). For example, to check the tablespace usage in the Oracle database of the Model Repository, enter the following command:

    /opt/opsware/oi_util/bin/run_all_probes.sh run check_database_tables

Syntax of the script for HCM global tests

The script that runs HCM global tests has the following syntax:

/opt/opsware/oi_util/bin/run_all_probes.sh run|list
    [<test> [<test>...]]
    [hosts="[<user>@]<system>[:<password>] [[<user>@]<system>[:<password>]]..."]
    [keyfile=<keyfiletype>:<keyfile>[:<passphrase>]]

The following table describes the options for this syntax.

Options for the HCM Global Test Script

Option

Description

list

Lists the available tests.

run

Runs the specified tests.

test

The name of the test to run. If no tests are specified, all tests are run. When shipped, the script includes the following tests:

check_opsware_services: Runs the local tests on all specified servers by running the following command remotely on each core server:
/etc/init.d/opsware-sas health

check_MM_state: For a multimaster source core, checks the multimaster state of the core.

check_time: In a multiserver core, verifies that the system clocks are synchronized across core servers.

check_opsware_version: Validates that all components in the core are the same version.

check_database_tables: Validates that the Model Repository tablespace usage is within acceptable limits. For more information on tablespaces, see the “Oracle Setup for Model Repository” section in the SA 10.50 Install Guide.

check_OS_resources: Validates that the virtual memory and disk space on SA partitions are within acceptable thresholds.

check_fully_functional: Validates full functionality of all SA components. For an alternative way to run System Diagnostics Comprehensive tests from the SA Client, see System diagnostic tests.

system

Name of a reachable SA core system.

keyfiletype

Specifies the type of key file to use. Allowed values are:

rsa_key_file

dsa_key_file

keyfile

Specifies the file containing the current server’s SSH private key. Passwordless login with keyfile is supported only for the root user.

passphrase

Specifies the passphrase that was used to encrypt the SSH private key.

user

Optional user for accessing the remote system. The user must have sudo permission. The default is root.

password

Optional password of <user> on <system>.
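
To show how the test, hosts, and keyfile arguments combine, the following sketch prints (rather than runs) a global-test command line. The function name hcm_global_cmd and the host names in the example are hypothetical. The only check performed is that the key file type is one of the two allowed values. Recall that if both a password and a key file are given, the password is used.

```shell
#!/bin/sh
PROBES=/opt/opsware/oi_util/bin/run_all_probes.sh

# Hypothetical helper: print a run_all_probes.sh command line.
#   $1  space-separated test names (empty string runs all tests)
#   $2  space-separated host entries, each [user@]system[:password]
#   $3  optional key file spec, keyfiletype:keyfile[:passphrase]
hcm_global_cmd() {
    tests=${1:-}
    hosts=${2:-}
    keyfile=${3:-}
    cmd="$PROBES run"
    if [ -n "$tests" ]; then
        cmd="$cmd $tests"
    fi
    if [ -n "$hosts" ]; then
        # The host list is one quoted argument with entries separated by spaces.
        cmd="$cmd hosts=\"$hosts\""
    fi
    if [ -n "$keyfile" ]; then
        case "$keyfile" in
            rsa_key_file:*|dsa_key_file:*) cmd="$cmd keyfile=$keyfile" ;;
            *) echo "keyfile must start with rsa_key_file: or dsa_key_file:" >&2
               return 1 ;;
        esac
    fi
    echo "$cmd"
}

# Example with placeholder host names and the root user's RSA key.
hcm_global_cmd "check_time check_opsware_version" \
    "core2.example.com core3.example.com" \
    "rsa_key_file:/root/.ssh/id_rsa"
```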

Setting up passwordless SSH for global tests

The global tests access remote servers in a core through the SSH daemon. These tests require you to supply user passwords or to use SSH public/private keys. Passwordless login with keyfile is supported only for the root user.

To set up authentication using public/private keys generated by ssh-keygen, perform the following steps:

  1. Run the following commands on the trusted server and accept the defaults. The commands are different for Linux and Solaris.

    Linux:

    cd /root/.ssh

    ssh-keygen -t dsa

    Solaris:

    cd /.ssh

    ssh-keygen -t dsa

  2. Update the client server by copying the id_dsa.pub file to the client server’s .ssh directory and renaming it to authorized_keys. The destination path differs between Linux and Solaris, as shown in the following example commands:

    Linux:

    scp id_dsa.pub <host>:/root/.ssh/authorized_keys

    Solaris:

    scp id_dsa.pub <host>:/.ssh/authorized_keys

  3. Verify the trusted server. Run the following command to validate that the trusted server can connect to the client server without a password:

    ssh -l root <host>
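
The verification in step 3 can also be scripted for several client servers at once. The following is a minimal sketch, assuming an OpenSSH client (the BatchMode and ConnectTimeout options belong to OpenSSH). The function name check_passwordless and the SSH variable are hypothetical.

```shell
#!/bin/sh
# The ssh program can be overridden through the SSH variable (for example,
# for a dry run); by default the regular ssh client is used.
SSH=${SSH:-ssh}

# Return 0 only if root can log in to every listed host without a
# password prompt; print one result line per host.
check_passwordless() {
    rc=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if $SSH -o BatchMode=yes -o ConnectTimeout=10 -l root "$host" true; then
            echo "$host: OK"
        else
            echo "$host: FAILED"
            rc=1
        fi
    done
    return $rc
}

# Example call (host names are placeholders):
#   check_passwordless core2.example.com core3.example.com
```

Because BatchMode disables password prompts, a host whose key was not installed correctly is reported as FAILED instead of leaving the script waiting for input.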