Recertify SA Cores

You must clear all backlogs and conflicts on your Multimaster Mesh before you start a core recertification.

Some recertification phases will be performed automatically, while others require multiple runs of the corerecert tool.

To recertify SA Cores:

  1. Ensure that you are classified as a Core Recertification User. If not, see your SA System Administrator.
  2. Log on to an SA Core host.
  3. Change directory to /opt/opsware/oi_util/OpswareCertTool/recert_utils/.
  4. Edit:

    corerecert.conf

    to ensure that the information is correct for your environment.

  5. Run:

    corerecert --status

    to ensure Core Recertification is not currently in progress.

  6. Run:

    discover_mesh -p

    to make sure the Core Recertification Tool can correctly detect your Mesh setup.

  7. Run:

    corerecert --phase 1

    from the command line to initialize Core Recertification.

    Mesh-wide gateways will be automatically restarted.

  8. Monitor the progress on screen by running:

    corerecert --status

    until it has indicated Phase 1 is completed.

  9. Run:

    corerecert --phase 4

    from the command line to start Phase 4, which appends a new Agent CA to all the Agents.

    Note
    If the agent_recert.using_cdr parameter is set to 0 in the corerecert.conf file, Phase 4 is skipped.

    Important
    Skipping Phase 4 leads to SWR communication test failures, and Phase 8 will not recertify the Agents.

  10. Monitor the progress on screen by running:

    corerecert --status

    until all the Agents have successfully had a new Agent CA appended.

    Note
    This step could take days depending on your maintenance windows and the Agent availability. There can be only one scheduled or active Agent Recertification job per facility at any given time. If you encounter any errors during this stage, resolve the errors and go back to step 9. You only need to reschedule the facilities that had errors. You do not need to reschedule the Agent Recert job for the successful facilities.

  11. Run:

    corerecert --phase 6 --doit

    from the command line to start Phase 6 of the core recertification.

    Note
    The Phase 6 job runs only if the recertification session is in Phase 1 or Phase 4 with status COMPLETED. Otherwise, the job logs a warning message and exits without taking any action. You can force Phase 6 to run by using the --doit flag. However, skipping Phase 4 leads to SWR communication test failures, and Phase 8 will not recertify the Agents.

  12. Monitor the progress on screen by running:

    corerecert --status

    until it has indicated mesh_restart_pending.

    At this point, you must restart the mesh, using the mesh restart instructions and sequences in the SA Administer section, SA Maintenance section.

    Note
    This step could take days depending on your maintenance window. If you encounter any errors during this stage, make sure you resolve the errors and go back to step 11.

  13. After manually restarting the mesh successfully, run:

    corerecert --phase 6

    from the command line to continue phase 6.

    In this step, SA performs two functions:

    • Checks to see if the restart took place on the cores.
    • Automatically restarts the primary-spin component of each SA facility.

  14. Monitor the progress on screen by running:

    corerecert --status

    until it indicates that Phase 6 is completed. If you encounter errors, resolve them and go back to step 13.

  15. Run:

    corerecert --phase 7

    from the command line to start phase 7.

    Mesh-wide gateways will be automatically restarted.

  16. Monitor the progress on screen by running:

    corerecert --status

    until it indicates that Phase 7 is completed. If you encounter errors, resolve them and go back to step 15.

  17. Run:

    corerecert --phase 8

    from the command line to start Phase 8, which recertifies all the Agents.

  18. Monitor the progress on screen by running:

    corerecert --status

    until all Agents have successfully been recertified.

    Note
    This step could take days depending on your maintenance windows and the Agent availability. There can be only one scheduled or active Agent Recertification job per facility at any given time. If you encounter any errors, resolve them and go back to step 17. You only need to reschedule the facilities that had errors. You do not need to reschedule the Agent Recertification job for the successful facilities.

  19. Run:

    corerecert --phase 9 --doit

    from the command line to start phase 9.

    Note
    The Phase 9 job runs only if the recertification session is in Phase 8 with status COMPLETED. Otherwise, the job logs a warning message and exits without taking any action. You can force Phase 9 to run by using the --doit flag. However, if Phase 8 is not completed, some Agents might not have been recertified. These Agents will not be able to communicate with the Core server once the old Agent CA is no longer trusted.

  20. Monitor the progress on screen by running:

    corerecert --status

    until it has indicated mesh_restart_pending. If you encounter any errors during this stage, make sure you resolve the errors and go back to step 19.

    At this point, ensure that there are no conflicts and no transaction backlogs in the mesh.

  21. On the base Slice core server:
    a. Issue the following commands:

      touch /var/opt/opsware/crypto/twist/upgradeInProgress

      /etc/init.d/opsware-sas restart

    b. Wait until the restart finishes successfully.

    At this point, work with your SA System Administrator to restart the rest of the mesh. This step could take days depending on your maintenance window. If you encounter any errors, resolve them and go back to step 19.

  22. After the mesh has been successfully restarted, the Recertification User must run:

    corerecert --phase 9

    from the command line to continue phase 9.

    SA checks to see if the restart took place on the cores.

  23. Monitor the progress on screen by running:

    corerecert --status

    until it indicates that Phase 9 is completed. If you encounter any errors, resolve them and go back to step 22.

  24. On the base Slice core server:
    a. Issue the following command:

      touch /opt/opsware/oi_util/OpswareCertTool/recert_utils/TruthResignStatus.txt /opt/opsware/oi_util/OpswareCertTool/recert_utils/WordResignStatus.txt

    b. Run:

      corerecert --phase 11

      from the command line to start Phase 11, which re-signs data in the model repository, software repository, recurring jobs, and audit streams.

  25. Monitor the progress on screen by running:

    corerecert --status

    until it indicates that Phase 11 is completed. If you encounter any errors, resolve them and go back to step 24b.

  26. If you do not intend to remove the old Agent CA, skip to step 28. Otherwise, run:

    corerecert --phase 12

    from the command line to start phase 12, which removes the old Agent CA from all the Agents.

  27. Monitor the progress on screen by running:

    corerecert --status

    until the old Agent CA has been removed from all the Agents.

    Note
    This step could take days depending on your maintenance windows and the Agent availability. If you encounter any errors during this stage, resolve the errors and go back to step 26. You only need to reschedule the facilities that had errors. You do not need to reschedule the Agent Recertification job for the successful facilities.

    Note
    For Core Recertification with a custom certificate, HPE recommends that you complete Phase 13 so that the old Core-component certificate is removed from the trusted CA store and only the customer certificate chain is used for verification.

  28. Run:

    corerecert --phase 13

    from the command line to start phase 13.

  29. Monitor the progress on screen by running:

    corerecert --status

    until it indicates mesh_restart_pending or core_recert_completed.

    Continue with the remaining instructions only if the status is mesh_restart_pending.

    Note
    At this point, you must restart the mesh, using the mesh restart instructions and sequences in the SA Administer section, SA Maintenance section.

    Note
    This step could take days depending on your maintenance window. If you encounter any errors during this stage, resolve the errors and go back to step 28.

  30. After the mesh has been successfully restarted, run:

    corerecert --phase 13

    from the command line to continue phase 13.

  31. Monitor the progress on screen by running:

    corerecert --status

    until it indicates that Core Recertification has completed successfully.
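
The monitoring steps in this procedure all repeat corerecert --status until a particular state is reported. The following is a minimal sketch of automating that wait, not part of the Core Recertification Tool. It assumes that the state name (for example, mesh_restart_pending) appears verbatim in the output of corerecert --status, which this procedure does not confirm. The function name wait_for_status, the polling interval, and the retry limit are hypothetical.

```shell
#!/bin/sh
# Poll a status command until its output contains an expected string.
# Usage: wait_for_status PATTERN INTERVAL_SECONDS MAX_TRIES COMMAND [ARGS...]
# Returns 0 when PATTERN is seen, 1 if MAX_TRIES polls pass without a match.
wait_for_status() {
    pattern=$1
    interval=$2
    max_tries=$3
    shift 3

    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        # Capture stdout and stderr; a failing command is treated as "no match yet".
        output=$("$@" 2>&1) || true
        # -F matches PATTERN as a fixed string, not as a regular expression.
        if printf '%s\n' "$output" | grep -qF -- "$pattern"; then
            printf '%s\n' "$output"
            return 0
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done

    printf 'Gave up after %s polls without seeing "%s"\n' "$max_tries" "$pattern" >&2
    return 1
}

# Hypothetical invocation (assumes corerecert is runnable from the current
# directory or PATH): poll every 5 minutes, for up to 24 hours.
#   wait_for_status mesh_restart_pending 300 288 corerecert --status
```

Because some phases can take days and errors must be resolved manually, a helper like this only replaces the repeated typing. Review the full corerecert --status output before moving on to the next step.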