Systems Infrastructure Aspects

Systems Infrastructure Aspects manage the health of every system in the environment. Each system has its own set of resources, hardware, and software that must be managed for the system to be healthy. The Aspects also monitor the performance of system resources such as CPU, memory, disk, file system, network interface, system processes and services, security, and system logging. Systems Infrastructure Aspects monitor the Computer CI types.

User Interface Reference

General

Provides an overview of the general attributes of the Systems Infrastructure Aspects.

CI Type: The type of CIs that can be assigned to the Aspect. This is the type of CI that is assigned to the Management Template. The Systems Infrastructure Aspects contain the Computer CI types.

Instrumentation: Provides an overview of the programs deployed to the CI types that contain the Systems Infrastructure Aspect.

Aspects

Provides an overview of any Aspects that contain the Systems Infrastructure Aspect. You can expand each item in the list to see more details about the nested Aspect.

Policy Templates

Provides an overview of the policy templates that contain the Systems Infrastructure Aspect. You can expand each item in the list to see more details about the policy template.

The Systems Infrastructure Aspects consist of the following:

Adaptive Thresholding

The Adaptive Thresholding Aspect is compatible with Operations Agent 12.00 or later. The Sys_AdaptiveThresholdingMonitor policy uses baseline data computed by Operations Agent to monitor performance and resource utilization. The Aspect contains a ConfigFile policy and a Measurement Threshold policy. You must tune the policies to specify the required baseline metrics and deviations.

Prerequisites for Deploying

Perform the following tasks before deploying the Aspect:

  1. Enable baselining in Operations Agent.

    By default, baselining is disabled on the Operations Agent node. You can enable baselining either through the command line or through a configuration file.

    To enable baselining using the configuration file, follow these steps:

    1. Open the Policy Template pane:

      Administration > Monitoring > Policy Template.

    2. In the Policy Template Groups pane, expand Policy Management > Templates grouped by type > Configuration > Node Info.
    3. In the Policy Templates pane, select the OPC_PERL_INCLUDE_INSTR_DIR policy and click the Edit icon. The Edit Node Info Policy window opens.
    4. In the Policy Data pane, add the following lines at the end:

      [oacore]
      ENABLE_BASELINE=TRUE
    5. Click Save and Close.

    Baseline data is computed only at the end of every hour. To have the baseline data computed immediately after you enable baselining, you must restart the oacore process.

    Run the following command to restart oacore:

    ovc -restart oacore
  2. Define metrics to create baseline metrics.

    Baseline classes are created in the Metrics Datastore based on the classes defined in the baseline.cfg file. For every metric specified in the baseline.cfg file, sixteen baseline metrics are created. The original baseline.cfg file is available in the %OvDataDir%/conf/sispi/configuration directory. The baseline.cfg file is overwritten after the ConfigFile policy is deployed.

  3. Decide where the deviation is required.

    You can configure deviations (N) either in the Sys_ConfigureBaselining policy or in the Sys_AdaptiveThresholdingMonitor policy.

    • To configure deviations for specific metrics, set the deviations in the Sys_ConfigureBaselining policy.
    • To configure deviations for all metrics, set the deviations in the Script-Parameters tab in the Sys_AdaptiveThresholdingMonitor policy.
    • If no deviations are set for a metric in the Sys_ConfigureBaselining policy, the deviations set in the Sys_AdaptiveThresholdingMonitor policy are used to calculate the adaptive threshold values.
  4. Edit the Sys_ConfigureBaselining ConfigFile policy to define deviations for specific metrics.

    1. Open the Management Templates & Aspects pane:

      Click Administration > Monitoring > Management Templates & Aspects.

    2. In the Configuration pane, expand Configuration Folders > Infrastructure Management > System Infrastructure Aspects.
    3. In the Management Templates & Aspects pane, select the Adaptive Thresholding Aspect and click the Edit icon. The Edit Aspect window opens.
    4. In the Policy Templates tab, double-click the Sys_ConfigureBaselining policy. The Edit ConfigFile Policy window opens.
    5. In the Policy Data tab, specify the class, metric, and deviations in the following format.

      <Class>:<Metric>,<Warning Deviation>,<Minor Deviation>,<Major Deviation>,<Minimum Value>,<Maximum Value>,<CutOff>

      In this format:

      • <Class> is the metric class.
      • <Metric> is the metric for which baseline data must be computed.

      Only the class and metric are mandatory. For instance-based monitoring, you can define deviations for specific instances.

      For example:

      dsk0,0.1,0.2,0.3,0,100,20
      dsk1,0.2,0.3,0.4,0,100,15
      dsk2:exclude

      For every group, one corresponding CFG file is created. The CFG files are automatically deleted when the instance metrics are deleted from the policy.

    6. Click Save and Close.
  5. Edit the Sys_AdaptiveThresholdingMonitor policy to define the deviations that apply to all metrics.

    1. Open the Management Templates & Aspects pane:

      Click Administration > Monitoring > Management Templates & Aspects.

    2. In the Configuration pane, expand Configuration Folders > Infrastructure Management > System Infrastructure Aspects.
    3. In the Management Templates & Aspects pane, select the Adaptive Thresholding Aspect and click the Edit icon. The Edit Aspect window opens.
    4. In the Policy Templates tab, double-click the Sys_AdaptiveThresholdingMonitor policy. The Edit Measurement Threshold Policy window opens.
    5. In the Policy Parameters tab, modify the default parameters as required.

    6. Click Save and Close.
  6. Include the latest versions of the policies in the Aspect, and then deploy the Aspect.
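The interplay between the two policies can be sketched in Python. This is only an illustration under stated assumptions, not the agent's implementation. The function names `parse_baseline_line` and `effective_deviations` are hypothetical. The sketch assumes that an omitted or empty optional field means "not set" and that the fallback applies per severity level. The class and metric names and the default values in the usage note are example values only.

```python
from typing import Dict, Optional, Tuple

# Optional fields that follow <Class>:<Metric>, in the documented order.
FIELDS = ("warning", "minor", "major", "minimum", "maximum", "cutoff")


def parse_baseline_line(line: str) -> Tuple[str, str, Dict[str, Optional[float]]]:
    """Parse '<Class>:<Metric>,<Warning>,<Minor>,<Major>,<Min>,<Max>,<CutOff>'.

    Only class and metric are mandatory; missing or empty fields become None.
    """
    head, *rest = line.strip().split(",")
    metric_class, metric = head.split(":", 1)
    values: Dict[str, Optional[float]] = {}
    for index, name in enumerate(FIELDS):
        raw = rest[index].strip() if index < len(rest) else ""
        values[name] = float(raw) if raw else None
    return metric_class, metric, values


def effective_deviations(
    per_metric: Dict[str, Optional[float]],
    policy_defaults: Dict[str, float],
) -> Dict[str, float]:
    """Prefer the per-metric deviation; otherwise use the monitor policy value."""
    result = {}
    for level in ("warning", "minor", "major"):
        value = per_metric[level]
        result[level] = value if value is not None else policy_defaults[level]
    return result
```

For example, `parse_baseline_line("GLOBAL:GBL_CPU_TOTAL_UTIL,0.1,0.2")` leaves the major deviation unset, so `effective_deviations` takes that level from the defaults passed in for the Sys_AdaptiveThresholdingMonitor policy.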

Adaptive Thresholding Aspect Detail

This Aspect monitors the Operations Agent baseline metrics.

CI Type | Policy Template | Indicator | Description | Policy Type
host_node | Sys_ConfigureBaselining | NA | Config policy to configure baseline metrics | ConfigFile
host_node | Sys_AdaptiveThresholdingMonitor | NA | Adaptive thresholding policy | Measurement Threshold

Bandwidth Utilization and Network IOPS

This Aspect monitors the I/O operations and performance of the systems in the network. Monitoring is based on the bandwidth used, the outbound queue length, and the average bytes transferred per second. This Aspect contains the following policy templates:

CI Type | Policy Template | Indicator | Description | Policy Type
Computer | Sys_PerNetifOutbyteBaseline-AT | NA | Multi-Instance Baseline for NETIF Outbyte | Measurement Threshold
Computer | Sys_PerNetifInbyteBaseline-AT | NA | Multi-Instance Baseline for NETIF Inbyte | Measurement Threshold
Computer | Sys_NetworkUsageAndPerformance | NA | Monitors network card usage (data in/out) | Measurement Threshold

CPU Performance

This Aspect monitors the overall CPU performance such as the CPU utilization percentage and spike in CPU usage. Individual CPU performance monitoring is based on total CPU utilization, CPU utilization in user mode, CPU utilization in system mode and interrupt rate. This Aspect contains the following policy templates:

CI Type | Policy Template | Indicator | Policy Description | Policy Type
host_node | Sys_CPUSpikeCheck | CPUUsageLevel | Monitors the variation in processor performance. A system experiences a CPU spike when there is a sharp rise in the CPU usage immediately followed by a decrease in usage. It also monitors the CPU time spent in user mode and system mode, and the total CPU time when the CPU is busy. | Measurement Threshold Template
host_node | Sys_GlobalCPUUtilization-AT | NA | Monitors the global CPU utilization Baseline. | Measurement Threshold
host_node | Sys_RunQueueLengthMonitor-AT | NA | Monitors the Run Queue Length Baseline. | Measurement Threshold
host_node | Sys_PerCPUUtilization-AT | NA | Monitors the multi-instance Baseline for CPU. | Measurement Threshold
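
The CPU spike definition used by Sys_CPUSpikeCheck (a sharp rise in usage immediately followed by a decrease) can be illustrated with a short Python sketch. This is not the policy's actual logic: the function name `find_cpu_spikes` and the 30-point `rise` default are hypothetical values chosen for the example.

```python
from typing import List, Sequence


def find_cpu_spikes(samples: Sequence[float], rise: float = 30.0) -> List[int]:
    """Return indexes where usage jumps by at least 'rise' points and then drops again."""
    spikes = []
    for i in range(1, len(samples) - 1):
        jumped = samples[i] - samples[i - 1] >= rise   # sharp rise from the previous sample
        dropped = samples[i] - samples[i + 1] >= rise  # immediate decrease in the next sample
        if jumped and dropped:
            spikes.append(i)
    return spikes
```

A sustained increase (for example 10, 80, 82, 81) is not reported by this sketch, because the rise is not immediately followed by a decrease.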

Change Configuration Monitor

The Change Configuration Monitor Aspect monitors files, Windows registry settings, and command outputs for changes.

On deploying the Change Configuration Monitor Aspect, a ccilist.cfg file is created in the <OvDataDir>/ccimon/configuration folder.

The Sys_ChangeConfigurationMonitor policy reads the ccilist.cfg file to monitor the following changes on the system:

  • Software installed, removed or modified
  • Patches/service packs/updates installed
  • Changes to Kernel parameters
  • Boot configuration
  • Registry keys (Windows only)
  • Kernel image file
  • All user accounts
  • System service configuration
  • Shared directories, NFS or CIFS (samba) mounts added, modified or removed
  • System environment variables

CCI and Desired State Monitoring Comparison Method

In both the CCI and the desired state monitoring methods, files, Windows registry settings, and command outputs are monitored for changes. The comparison method used by each is as follows:

CCI Monitoring

During the first polling interval, a backup is taken of all the files specified in the ccilist.cfg file. From the next polling interval onward, the current version of each file is compared with its backup version. An alert is generated if a modification is identified. The backup files are then overwritten with a fresh backup, so the comparison is always between the current version and the most recent backup version of a file.
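
The CCI comparison cycle can be sketched in Python as follows. This is a simplified illustration for a single file, not the behavior of the Sys_ChangeConfigurationMonitor policy itself: the function name `cci_poll`, the `.bak` suffix, and the byte-for-byte comparison are assumptions made for the example.

```python
import shutil
from pathlib import Path


def cci_poll(monitored: Path, backup_dir: Path) -> bool:
    """Run one polling interval for a file; return True if a change alert is due."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    backup = backup_dir / (monitored.name + ".bak")
    if not backup.exists():
        # First polling interval: take the backup only, nothing to compare yet.
        shutil.copyfile(monitored, backup)
        return False
    changed = monitored.read_bytes() != backup.read_bytes()
    # Refresh the backup so the next poll compares against the latest version.
    shutil.copyfile(monitored, backup)
    return changed
```

Because the backup is refreshed on every poll, a change is reported once and the changed content then becomes the new reference.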

Desired State Monitoring

A gold file (with extension .gold) must be created for every single file that must be monitored and must be available in the same directory as the file. A gold file is a backup or reference file that remains unchanged.

For example, let us consider that you want to monitor the mtab file located in the /etc directory. Take a backup of this file and save it as mtab.gold in the /etc directory. This is your reference file or gold file which does not change. To monitor the mtab file add the following to the configuration file:

/etc/mtab==/etc/mtab.gold,file,Os,,major

After the Aspect is deployed, a check is performed to verify whether desired state monitoring is defined in the ccilist.cfg configuration file. The files, Windows registry settings, and command outputs specified in the configuration file are compared with the corresponding gold files. An alert is generated whenever a difference is identified between the two.

Make sure that you define desired state monitoring in the ConfigFile policy only after creating the gold file.
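
In contrast to CCI monitoring, the reference file in desired state monitoring never changes. A minimal Python sketch of that comparison, assuming a byte-for-byte check and using the hypothetical function name `desired_state_differs`, could look like this:

```python
from pathlib import Path


def desired_state_differs(monitored: Path, gold: Path) -> bool:
    """Return True if the monitored file differs from its gold (reference) file."""
    if not gold.exists():
        # The gold file must be created before desired state monitoring is defined.
        raise FileNotFoundError(f"gold file missing: {gold}")
    # The gold file is only read, never overwritten.
    return monitored.read_bytes() != gold.read_bytes()
```

Since the gold file is left untouched, a deviation keeps being detected on every poll until the monitored file matches the reference again.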

You must tune one of the following policies based on the Operating System of the node to specify CCI or desired state monitoring:

  • Sys_MSWindowsChangeConfig
  • Sys_SunSolarisChangeConfig
  • Sys_LinuxChangeConfig
  • Sys_HPUXChangeConfig
  • Sys_AIXChangeConfig

Tuning ConfigFile Policy

Edit the ConfigFile policy to define the changes to be monitored.

  1. Open the Management Templates & Aspects pane:

    Click Administration > Monitoring > Management Templates & Aspects.

  2. In the Configuration pane, expand Configuration Folders > Infrastructure Management > System Infrastructure Aspects.
  3. In the Management Templates & Aspects pane, select the Change Configuration Monitor Aspect and click the Edit icon. The Edit Aspect window opens.
  4. In the Policy Templates tab, double-click the required policy. The Edit ConfigFile Policy window opens.
  5. In the Policy Data tab, specify the items to monitor in the following format.

    <change ci key>,<cci type (file|cmd|regkey)>,<msg group>,<backup filename>,<alert severity>

    In this format:

    • <change ci key> - Specifies a registry key, a command, or a file name with its complete path.
    • <cci type> - Set this to cmd, regkey, or file, based on the change ci key. The registry key (regkey) type is available only for Windows managed nodes.
    • <msg group> - Specifies the OMi message group setting for the change alert. The default message group is Misc.
    • <backup filename> - Specifies the name with which a backup file is created in the backup folder. The backup file is used for comparison with the parent file. Leave this value empty for the cci type file. The backup folder is located in the <OvDataDir>/tmp directory.
    • <alert severity> - Specifies the OMi alert severity setting. The default alert severity is Warning.

  6. Click Save and Close.
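
The entry format can be made concrete with a small Python parser. This is a sketch under stated assumptions, not the parser used by the policy: the `CciEntry` class and `parse_cci_entry` function are hypothetical names, the entry is assumed to have exactly five fields (an optional trailing charset field is not handled), and the line is split from the right so that commas inside a command remain part of the key.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CciEntry:
    key: str                  # registry key, command, or file path
    reference: Optional[str]  # gold file for desired state monitoring, else None
    cci_type: str             # file, cmd, or regkey
    msg_group: str
    backup_name: str
    severity: str


def parse_cci_entry(line: str) -> CciEntry:
    # Split from the right so that commas inside a command stay in the key.
    key, cci_type, msg_group, backup_name, severity = line.strip().rsplit(",", 4)
    reference = None
    if "==" in key:
        # 'key==reference' marks a desired state monitoring entry.
        key, reference = key.split("==", 1)
        reference = reference.strip()
    return CciEntry(
        key=key.strip(),
        reference=reference,
        cci_type=cci_type.strip(),
        msg_group=msg_group.strip() or "Misc",      # documented default
        backup_name=backup_name.strip(),
        severity=severity.strip() or "Warning",     # documented default
    )
```

For example, `parse_cci_entry("/etc/passwd,file,Security,,warning")` yields an entry of type `file` with an empty backup name and no reference file.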

Change Configuration Monitor Aspect Detail

This Aspect monitors changes related to the system configuration.

CI Type | Policy Template | Indicator | Description | Policy Type
host_node | Sys_MSWindowsChangeConfig | NA | Change Configuration Monitor Config file policy for MSWindows managed nodes. | ConfigFile
host_node | Sys_SunSolarisChangeConfig | NA | Change Configuration Monitor Config file policy for SunSolaris managed nodes. | ConfigFile
host_node | Sys_LinuxChangeConfig | NA | Change Configuration Monitor Config file policy for Linux managed nodes. | ConfigFile
host_node | Sys_ChangeConfigurationMonitor | NA | Change CI Monitor - Alerts for system configuration changes. | Measurement Threshold
host_node | Sys_HPUXChangeConfig | NA | Change Configuration Monitor Config file policy for HPUX managed nodes. | ConfigFile
host_node | Sys_AIXChangeConfig | NA | Change Configuration Monitor Config file policy for AIX managed nodes. | ConfigFile

Examples for CCI and Desired State Monitoring

CCI Monitoring

Each scenario is followed by the syntax to use in the ConfigFile policy:

  • To monitor the hosts file on Windows and send warning alerts with the misc message group:

    c:\Windows\System32\drivers\etc\hosts,file,misc,,warning

  • To monitor the sys-temp folder on Windows for any changes:

    dir "%temp%"| findstr /V bytes,cmd,OS,dirtmpbin,warning

  • To monitor a registry key and its values on Windows:

    HKEY_LOCAL_MACHINE\SOFTWARE\CCIMon,regkey,misc,temp,warning

  • On Windows, to monitor if the opcmona.exe process is running on a node and if it is different from the last run:

    wmic process where name='opcmona.exe' get processid,cmd,OS,notepadproc, major,unicode

  • To monitor if there are any new files or other changes in the /tmp folder on Linux:

    ls -1 /tmp | sort -u,cmd,Misc,ls1tmp.txt,warning

  • To monitor if there are any user changes on UNIX/Linux:

    /etc/passwd,file,Security,,warning

  • To check for new file systems mounted on UNIX/Linux:

    /etc/mtab,file,OS,,minor

Desired State Monitoring

Each scenario is followed by the syntax to use in the ConfigFile policy and an example:

  • To monitor a file against its gold file and send warning alerts with the misc message group (the example uses the mtab file on UNIX/Linux):

    Syntax: filename==reference file name,ccitype,msg group,[backup filename],alert severity,charset

    Example: /etc/mtab==/etc/mtab.gold,file,misc,,warning

  • To monitor a folder for any changes, use the cci type cmd (the example lists the root directory on UNIX/Linux):

    Syntax: command==path of the file containing the command output,ccitype,msg group,[backup filename],severity

    Example: ls /==/root/list.txt,cmd,Misc,,major

  • To monitor a registry key and its values on Windows:

    Syntax: registry key=='value of registry key',ccitype,msg group,[backup filename],severity

    Example: HKEY_LOCAL_MACHINE\SOFTWARE\config==config,regkey,misc,,warning

General System Services Availability

This Aspect monitors the availability of the following system services and processes:

  • HP-UX: Bootpd, Cron, and Network File System (NFS)
  • Linux: Dynamic Host Configuration Protocol (DHCP), Named, NFS, Sendmail, Cron, and Server Message Block (Smb)
  • Windows: Distributed File System (DFS), DHCP, Domain Name System (DNS), File Transfer Protocol (FTP), Firewall, Fax, NFS, Remote Procedure Call (RPC), RRA, Print, Simple Network Management Protocol (SNMP), Terminal Server, Web Management Tools, and Web Server Service
  • AIX: Cron, DHCP, Named, NFS, Portmap, Sendmail, and Webserver
  • Solaris: DHCP, Named, NFS, Sendmail, Cron, and SNMP
  • Debian: Apache, Cron, Exim, Internet Service Daemon (InetD), Named, NFS, NetBIOS Name Server Daemon (Nmbd), Samba, and Secure Shell Daemon (Sshd)

This Aspect consists of the following policy templates:

CI Type | Policy Template | Indicator | Policy Description | Policy Type
Computer | Sys_AIXCronProcessMonitor | BatchJobService | Monitors the Cron daemon processes running on the AIX operating systems. | Service/Process Monitoring Template
Computer | Sys_AIXDHCPProcessMonitor | DHCPServerService | Monitors the DHCP server daemon processes running on AIX operating systems. | Service/Process Monitoring Template
Computer | Sys_AIXNamedProcessMonitor | DNSService | Monitors the Named processes running on the AIX operating systems. | Service/Process Monitoring Template
Computer | Sys_AIXNfsServerProcessMonitor | FileServerService | Monitors the NFS server related processes running on the AIX operating systems. | Service/Process Monitoring Template
Computer | Sys_AIXPortmapProcessMonitor | RPCService | Monitors the Portmap process, which converts RPC program numbers into Internet port numbers, running on the AIX operating systems. | Service/Process Monitoring Template
Computer | Sys_AIXQdaemonProcessMonitor | PrintService | Monitors the job requests and the resources required to complete the jobs running on AIX operating systems. | Service/Process Monitoring Template
Computer | Sys_AIXSendmailProcessMonitor | EmailService | Monitors the Sendmail daemon processes running on the AIX operating systems. | Service/Process Monitoring Template
Computer | Sys_AIXWebserverProcessMonitor | WebServerService | Monitors the httpd daemon processes running on the AIX operating systems. | Service/Process Monitoring Template
Computer | Sys_HPUXBootpdProcessMonitor | DHCPServerService | Monitors the Bootpd daemon processes running on the HP-UX operating systems. | Service/Process Monitoring Template
Computer | Sys_HPUXCronProcessMonitor | BatchJobService | Monitors the Cron daemon processes on the HP-UX operating systems. | Service/Process Monitoring Template
Computer | Sys_HPUXNfsServerProcessMonitor | FileServerService | Monitors the state of NFS daemon processes running on the HP-UX operating systems. | Service/Process Monitoring Template
Computer | Sys_LinuxDHCPProcessMonitor | DHCPServerService | Monitors the DHCP daemon processes running on Linux operating systems. | Service/Process Monitoring Template
Computer | Sys_LinuxNamedProcessMonitor | DNSService | Monitors the Named daemon processes running on Linux operating systems. | Service/Process Monitoring Template
Computer | Sys_LinuxNfsServerProcessMonitor | FileServerService | Monitors the state of NFS daemon processes running on Linux operating systems. | Service/Process Monitoring Template
Computer | Sys_LinuxSendmailProcessMonitor | EmailService | Monitors the Sendmail daemon processes running on the Linux operating systems. | Service/Process Monitoring Template
Computer | Sys_LinuxSmbServerProcessMonitor | FileServerService | Monitors SMB daemon processes running on the Linux operating systems. | Service/Process Monitoring Template
Computer | Sys_MSWindowsDFSRoleMonitor | NA | Monitors the availability of system services required for the DFS role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsDHCPServerRoleMonitor | DHCPServerService | Monitors the availability of system services required for the DHCP server role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsDNSServerRoleMonitor | DNSService | Monitors the availability of system services required for the DNS server role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsFTPServiceRoleMonitor | FTPService | Monitors the availability of system services required for the FTP publishing service role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsFaxServerRoleMonitor | NA | Monitors the availability of system services required for the fax server role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsFirewallRoleMonitor | FirewallService | Monitors the availability of system services required for the Windows firewall. | Service/Process Monitoring Template
Computer | Sys_MSWindowsNFSRoleMonitor | NA | Monitors the availability of system services required for the NFS role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsPrintServiceRoleMonitor | PrintService | Monitors the availability of system services required for print services role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsRRAServicesRoleMonitor | NA | Monitors the availability of system services required for routing and remote access services role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsRpcRoleMonitor | RPCService | Monitors the availability of system services required for RPC. | Service/Process Monitoring Template
Computer | Sys_MSWindowsSnmpProcessMonitor | NA | Monitors the SNMP service on Windows operating systems. | Service/Process Monitoring Template
Computer | Sys_MSWindowsTSGatewayRoleMonitor | NA | Monitors the availability of system services required for Terminal Services (TS) gateway role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsTSLicensingRoleMonitor | NA | Monitors the availability of system services required for TS licensing role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsTSWebAccessRoleMonitor | NA | Monitors the availability of system services required for TS web access role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsTerminalServerRoleMonitor | MSTerminalService | Monitors the availability of system services required for terminal server role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsWebMgmtToolsRoleMonitor | NA | Monitors the availability of system services required for web management tools role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsWebServerRoleMonitor | WebServerService | Monitors the availability of system services required for web server role service. | Service/Process Monitoring Template
Computer | Sys_OpenSshdProcessMonitor | SecureLoginService | Monitors SSH daemon processes running on the system. | Service/Process Monitoring Template
Computer | Sys_RHELCronProcessMonitor | BatchJobService | Monitors Cron daemon processes running on RHEL operating systems. | Service/Process Monitoring Template
Computer | Sys_SLESCronProcessMonitor | BatchJobService | Monitors Cron daemon processes running on SLES operating systems. | Service/Process Monitoring Template
Computer | Sys_SunSolarisCronProcessMonitor | BatchJobService | Monitors Cron daemon processes running on Sun Solaris operating systems. | Service/Process Monitoring Template
Computer | Sys_SunSolarisDHCPProcessMonitor | DHCPServerService | Monitors DHCP daemon processes running on Sun Solaris operating systems. | Service/Process Monitoring Template
Computer | Sys_SunSolarisNamedProcessMonitor | DNSService | Monitors Named daemon processes running on Sun Solaris operating systems. | Service/Process Monitoring Template
Computer | Sys_SunSolarisNfsProcessMonitor | FileServerService | Monitors NFS processes running on Sun Solaris operating systems. | Service/Process Monitoring Template
Computer | Sys_SunSolarisSendmailProcessMonitor | EmailService | Monitors the Sendmail daemon processes running on Sun Solaris operating systems. | Service/Process Monitoring Template
Computer | Sys_UnixSnmpdProcessMonitor | NA | Monitors the SNMP processes running on Linux and Unix operating systems. | Service/Process Monitoring Template
Computer | Sys_DebianApacheProcessMonitor | WebServerService | Monitors the Apache processes running on Debian operating systems. | Service/Process Monitoring Template
Computer | Sys_DebianCronProcessMonitor | BatchJobService | Monitors the Cron daemon processes running on Debian operating systems. | Service/Process Monitoring Template
Computer | Sys_DebianEximProcessMonitor | EmailService | Monitors Exim processes running on the Debian operating systems. | Service/Process Monitoring Template
Computer | Sys_DebianInetdProcessMonitor | NA | Monitors the Inetd processes running on the Debian operating systems. | Service/Process Monitoring Template
Computer | Sys_DebianNamedProcessMonitor | DNSService | Monitors the Named processes running on Debian operating systems. | Service/Process Monitoring Template
Computer | Sys_DebianNfsServerProcessMonitor | FileServerService | Monitors the NFS processes running on Debian operating systems. | Service/Process Monitoring Template
Computer | Sys_DebianNmbdProcessMonitor | FileServerService | Monitors the Nmbd processes running on Debian operating systems. | Service/Process Monitoring Template
Computer | Sys_DebianSambaProcessMonitor | FileServerService | Monitors the Samba processes running on Debian operating systems. | Service/Process Monitoring Template
Computer | Sys_DebianSshdProcessMonitor | NA | Monitors the SSH daemon processes running on Debian operating systems. | Service/Process Monitoring Template

Key System Services Availability

The Aspect monitors the key processes that run in the background to support the different tasks required for the operating system or application.

The Sys_ProcessMonitor policy monitors all the processes in the process groups. In the Sys_ProcessMonitorConfig policy, you must specify the process group and the location of the procmon.cfg configuration file. You can also specify these as optional parameters while deploying the Aspect. Alerts are generated whenever the processes defined in the configuration file do not run as expected or are out of limits during the specified time of the day and day of the week.

After the Aspect is deployed, if the procmon.cfg file is available in the location specified in Sys_ProcessMonitorConfig policy, then the file is overwritten. If the file is not available, then a new file is created in the designated location.

Prerequisites

Use the following syntax to edit the Sys_ProcessMonitor and Sys_ProcessMonitorConfig policies before deploying the Aspect.

Syntax for ConfigFile Policy

[ProcessGroupName]
Process name<tab>Argument<tab>Time of the day<tab>Days of Week<tab>Bounds

In this syntax:

  • Process name: Specifies the name of the process to be monitored.
  • Argument: Specifies the arguments that are used to distinguish between multiple processes running simultaneously. If no arguments are present, an asterisk (*) must be specified.
  • Time of the day: Specifies the time period (in the 24-hour format) during which a process failure must be reported.
  • Days of Week: Specifies the day or days on which a process failure must be reported. Each day of the week is identified with a number from 0 (Sunday) to 6 (Saturday). Separate the numbers with commas.
  • Bounds: Specifies the number of instances of the named process. You can specify the number of instances as follows:

    n: An exact number

    n-: A minimum of n

    -n: A maximum of n

    m-n: A range of m to n

  • @severity: Specifies the severity of alert messages, such as Minor, Major, or Critical. The default severity is Warning.
  • @start: Specifies the command (<cmd>) that must be run during a process failure.
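
The four Bounds forms can be interpreted with a few lines of Python. This is an illustrative sketch, not the logic of the Sys_ProcessMonitor policy: the function names `parse_bounds` and `within_bounds` are hypothetical, and the sketch assumes that a minimum is written with a trailing hyphen (n-), mirroring the leading hyphen used for a maximum (-n).

```python
from typing import Optional, Tuple


def parse_bounds(spec: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (minimum, maximum) instance counts; None means unbounded."""
    spec = spec.strip()
    if spec.startswith("-"):      # -n: a maximum of n
        return None, int(spec[1:])
    if spec.endswith("-"):        # n-: a minimum of n (assumed notation)
        return int(spec[:-1]), None
    if "-" in spec:               # m-n: a range of m to n
        low, high = spec.split("-", 1)
        return int(low), int(high)
    exact = int(spec)             # n: an exact number
    return exact, exact


def within_bounds(count: int, spec: str) -> bool:
    """Check whether the number of running instances satisfies the Bounds value."""
    low, high = parse_bounds(spec)
    return (low is None or count >= low) and (high is None or count <= high)
```

With a Bounds value of `1`, for example, `within_bounds(0, "1")` is false, which corresponds to the case where the monitored process is not running.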

Scenarios and syntax for customizing the ConfigFile policy:

  • To monitor processes all week during every polling interval:

    Syntax:

    [ProcessGroupName]
    processname<tab>Argument<tab>ProcessLimits

    Example:

    /opt/OV/lbin/eaagt/opcmona	*	1

  • To monitor processes with more than one argument:

    Syntax: processname<tab>Argument1<space>Argument2<tab>ProcessLimits

    Arguments must be separated with a space.

    Example:

    [OMi_Mgt]
    /opt/OV/bin/oacore	oacore /var/opt/OV/conf/oa/PipeDefinitions/oacore.xml	1

  • To monitor processes only at a specified time of the day or day of the week:

    Syntax:

    [ProcessGroupName]
    processname<tab>Argument<tab>Timeoftheday<tab>Daysofweek<tab>ProcessLimits

    Example:

    [OMi_Mgt]
    /opt/OV/lbin/agtrep/agtrep	-start	5-23	0,1,2,3	1

  • To specify the severity of the alert message:

    Syntax: @severity=<severity>

    <severity> can be warning, minor, major, or critical.

    Example: @severity=minor

  • To specify an operator-initiated command:

    Syntax: @start=<command to be executed>

    Example: @start=ovc -start agtrep
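
The time-of-day and day-of-week fields (for example `5-23` and `0,1,2,3`) can be evaluated as shown in the following Python sketch. It is an illustration only: the function name `in_monitoring_window` is hypothetical, and the sketch assumes that the time of day is a range of whole hours that is inclusive at both ends.

```python
from datetime import datetime


def in_monitoring_window(now: datetime, time_of_day: str, days_of_week: str) -> bool:
    """Check whether 'now' falls inside a schedule such as '5-23' and '0,1,2,3'."""
    start_hour, end_hour = (int(part) for part in time_of_day.split("-", 1))
    days = {int(day) for day in days_of_week.split(",")}
    # datetime.weekday() counts Monday as 0; the configuration counts Sunday as 0.
    config_day = (now.weekday() + 1) % 7
    return config_day in days and start_hour <= now.hour <= end_hour
```

With the agtrep example entry, a failure at 10:00 on a Sunday falls inside the window, while the same failure on a Thursday (day 4) does not.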

Key System Services Availability Aspect Detail

This Aspect consists of the following policy templates:

CI Type | Policy Template | Indicator | Policy Description | Policy Type
Computer | Sys_AIXSyslogProcessMonitor | NA | This policy template monitors the Syslog processes running on AIX operating systems. | Service/Process Monitoring Template
Computer | Sys_HPUXSshdProcessMonitor | NA | This policy template monitors the SSH daemon processes running on HP-UX operating systems. | Service/Process Monitoring Template
Computer | Sys_HPUXSyslogProcessMonitor | NA | This policy template monitors the Syslog daemon processes running on HP-UX operating systems. | Service/Process Monitoring Template
Computer | Sys_LinuxSshdProcessMonitor | NA | This policy template monitors the SSH daemon processes running on Linux operating systems. | Service/Process Monitoring Template
Computer | Sys_MSWindowsEventLogRoleMonitor | NA | This policy template monitors the availability of system services required for event log role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsFileServerRoleMonitor | NA | This policy template monitors the availability of system services required for files server role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsNetworkPolicyServerRoleMonitor | NA | This policy template monitors the availability of system services required for network policy server role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsTaskSchedulerRoleMonitor | NA | This policy template monitors the availability of system services required for task scheduler role service. | Service/Process Monitoring Template
Computer | Sys_MSWindowsWin2k3FileServicesRoleMonitor | NA | This policy template monitors the availability of system services required for Win2k3 files services role service. | Service/Process Monitoring Template
Computer | Sys_RHELSyslogProcessMonitor | NA | This policy template monitors the Syslog daemon processes running on RHEL operating systems. | Service/Process Monitoring Template
Computer | Sys_SLESSyslogProcessMonitor | NA | This policy template monitors the Syslog daemon processes running on SLES operating systems. | Service/Process Monitoring Template
Computer | Sys_SunSolarisSshdProcessMonitor | NA | This policy template monitors the SSH daemon processes running on Sun Solaris operating systems. | Service/Process Monitoring Template
Computer | Sys_SunSolarisSyslogProcessMonitor | NA | This policy template monitors the system log processes running on Sun Solaris operating systems. | Service/Process Monitoring Template
Computer | Sys_ProcessMonitor | NA | Monitor processes and process groups. | Measurement Threshold
Computer | Sys_ProcessMonitorConfig | NA | Configfile policy for system process monitoring. | ConfigFile

Memory and Swap Utilization

The Memory and Swap Utilization Aspect monitors memory performance of the system. Memory performance monitoring is based on Memory utilization (in percentage), Swap space utilization (in percentage), Free memory available (in MBs), and Free swap space available (in MBs). This Aspect consists of the following policy templates:

CI Type | Policy Template | Indicator | Description | Policy Type
host_node | Sys_MSWindowsNonPagedPoolUtilization-AT | NA | Multi-Instance baseline for memory Non-Paged pool | Measurement Threshold
host_node | Sys_MSWindowsPagedPoolUtilization-AT | NA | Multi-Instance baseline for Memory Paged pool | Measurement Threshold
host_node | Sys_MemoryUtilization-AT | NA | Global Memory Utilization Baseline | Measurement Threshold
host_node | Sys_SwapCapacityMonitor | SwapUsageLevel | Monitors the swap space utilization of the system. | Measurement Threshold
host_node | Sys_MemoryUsageAndPerformance | MemoryUsageLevel | Monitors the memory usage of the system and shows error rates and collisions to identify potential memory bottlenecks. | Measurement Threshold
host_node | Sys_SwapUtilization-AT | NA | Global Swap Space Utilization Baseline | Measurement Threshold

Operations Agent Self Monitoring

This Aspect contains the Operations Agent policies for self-monitoring.

CI Type | Policy Template | Indicator | Description | Policy Type
host_node | OA_SelfMonTstActa | NA | OA self monitor - test action agent | Scheduled Task
host_node | OA_SelfMonTstMonaExt | NA | OA self monitor - test monitor agent (external monitor) | Measurement Threshold
host_node | OA_SelfMonVerifyMon | NA | OA self monitor - verifies flag files by opcmona | Measurement Threshold
host_node | OA_SelfMonTstTrapi | NA | OA self monitor - test the SNMP trap interceptor | SNMP Interceptor
host_node | OA_SelfMonVerifyLe | NA | OA self monitor - verifies the flag files (opcle) | Logfile Entry
host_node | OA_SelfMonTstMsgi | NA | OA self monitor - test message interceptor | Open Message Interface
host_node | OA_SelfMonTstAll | NA | OA self monitor - test critical Operations Agent processes | Measurement Threshold
host_node | OA_SelfMonTstLe | NA | OA self monitor - test log file encapsulator | Logfile Entry

Performance Collection Component

Alarmdef, parm, and message interceptor policies for the Performance Collection Component.

CI Type Policy Template Indicator Description Policy Type
host_node OA_WindowsAlarmdefPolicy NA Alarmdef policy for performance collection component of HP Operations Agent on Windows platform. ConfigFile
host_node OA_AixParmPolicy NA Parm policy for performance collection component of HP Operations Agent on AIX platform. ConfigFile
host_node OA_SunOSParmPolicy NA Parm policy for performance collection component of HP Operations Agent on SunOS platform. ConfigFile
host_node OA_LinuxParmPolicy NA Parm policy for performance collection component of HP Operations Agent on Linux platform. ConfigFile
host_node OA_HpsensorConfPolicy NA hpsensor configuration policy for performance collection component of HP Operations Agent. ConfigFile
host_node OA_AixAlarmdefPolicy NA Alarmdef policy for performance collection component of HP Operations Agent on AIX platform. ConfigFile
host_node OA_SunOSAlarmdefPolicy NA Alarmdef policy for performance collection component of HP Operations Agent on SunOS platform. ConfigFile
host_node OA_HP-UXParmPolicy NA Parm policy for performance collection component of HP Operations Agent on HP-UX platform. ConfigFile
host_node OA_VMWareAlarmdefPolicy NA Alarmdef policy for performance collection component of HP Operations Agent on VMWare platform. ConfigFile
host_node OA_WindowsParmPolicy NA Parm policy for performance collection component of HP Operations Agent on Windows platform. ConfigFile
host_node OA_LinuxAlarmdefPolicy NA Alarmdef policy for performance collection component of HP Operations Agent on Linux platform. ConfigFile
host_node OA_PerfCollComp-opcmsg NA Default interceptor for messages sent by the post-deployment script of the Performance Collection Component ConfigFile policies. Open Message Interface
host_node OA_VMWareParmPolicy NA Parm policy for performance collection component of HP Operations Agent on VMWare platform. ConfigFile
host_node OA_HP-UXAlarmdefPolicy NA Alarmdef policy for performance collection component of HP Operations Agent on HP-UX platform. ConfigFile

Remote Disk Space Utilization

The Remote Disk Space Utilization Aspect monitors the space utilization of remote disks. This Aspect consists of the following policy templates:

CI Type Policy Template Indicator Policy Description Policy Type
file_system host_node Sys_LinuxCifsUtilizationMonitor NA Monitors space utilization level for CIFS remote filesystems on Linux platforms. Measurement Threshold
file_system host_node Sys_LinuxNfsUtilizationMonitor NA Monitors space utilization level for NFS remote filesystems on Linux platforms. Measurement Threshold

Realtime Resource Bottleneck

This Aspect monitors bottleneck situations using real-time data.

CI Type Policy Template Indicator Description Policy Type
host_node Sys_MSWindowsRealTimeAlerts NA Log file policy for realtime alerts LogFile Entry
host_node Sys_SunSolarisRealTimeConfig NA Perfd Advisor script for SunOS ConfigFile
host_node Sys_AIXRealTimeConfig NA Perfd Advisor script for AIX ConfigFile
host_node Sys_LinuxRealTimeAlerts NA Log file policy for realtime alerts LogFile Entry
host_node Sys_HPUXRealTimeConfig NA Perfd Advisor script for HP-UX ConfigFile
host_node Sys_LinuxRealTimeConfig NA Perfd Advisor script for Linux ConfigFile
host_node Sys_MSWindowsRealTimeConfig NA Perfd Advisor script for Windows ConfigFile

Resource Bottleneck Diagnosis

This Aspect identifies congestion and bottleneck conditions for system resources such as the CPU, memory, network, and disk. CPU bottleneck monitoring is based on global CPU utilization and load average (run queue length). Memory bottleneck monitoring is based on memory utilization, free memory available, and memory swap-out rate. Filesystem monitoring is based on the space utilization level of the busiest filesystem on the node. Network monitoring is based on packet collision rate, packet error rate, and outbound queue length. This Aspect contains the following policy templates:

CI Type Policy Template Indicator Policy Description Policy Type
Computer Sys_CPUBottleneckDiagnosis CPULoad Detects CPU bottlenecks such as exceeding the thresholds for CPU utilization percentage, processor queue length, and total number of CPUs running on an operating system. For example, if the threshold for CPU utilization is violated along with the threshold for the number of processes in the queue waiting for CPU time, the policy sends an alert. The message also displays a list of the top ten CPU utilization processes. Measurement Threshold Template
Computer Sys_DiskPeakUtilMonitor DiskUsageLevel Monitors the utilization level of the disk on the system. It checks whether the disk is fully utilized. Measurement Threshold Template
Computer Sys_MemoryBottleneckDiagnosis MemoryLoad Monitors the physical memory utilization and the bottlenecks. Memory bottleneck condition occurs when the memory utilization is high and the available memory is very low. It causes the system to slow down affecting overall performance. High memory consumption results in excessive page outs, high page scan rate, swap-out byte rate, and page request rate, eventually slowing down the system. The message also displays a list of top ten memory utilization processes. Measurement Threshold Template
Computer Sys_NetworkInterfaceErrorDiagnosis SESSION Monitors the network usage of the system and checks for potential network bottlenecks or errors. Measurement Threshold Template
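
The combined condition described for Sys_CPUBottleneckDiagnosis (the CPU utilization threshold violated together with the run queue threshold) can be illustrated with a minimal Python sketch. The function name, the threshold values, and the process list format are hypothetical and are not taken from the policy template; the shipped policy's actual logic and defaults may differ.

```python
from typing import List, Optional, Tuple

# Hypothetical thresholds for illustration only; the real policy
# parameters and their defaults are defined in the policy template.
CPU_UTIL_THRESHOLD_PCT = 90.0
RUN_QUEUE_THRESHOLD = 4.0


def diagnose_cpu_bottleneck(
    cpu_util_pct: float,
    run_queue_length: float,
    processes: List[Tuple[str, float]],
) -> Optional[str]:
    """Return an alert message when both thresholds are violated, else None."""
    if cpu_util_pct <= CPU_UTIL_THRESHOLD_PCT:
        return None
    if run_queue_length <= RUN_QUEUE_THRESHOLD:
        return None
    # List the ten processes with the highest CPU utilization.
    top = sorted(processes, key=lambda p: p[1], reverse=True)[:10]
    details = ", ".join(f"{name} ({pct:.1f}%)" for name, pct in top)
    return (
        f"CPU bottleneck: utilization {cpu_util_pct:.1f}%, "
        f"run queue {run_queue_length:.1f}. Top processes: {details}"
    )
```

Requiring both conditions avoids alerting on a busy CPU that is not actually causing processes to wait.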

Server Hardware Fault

This Aspect monitors the health and status of ProLiant servers. These policies monitor the Simple Network Management Protocol (SNMP) traps generated by the SIM Agent and send alert messages to the HPOM console. All these policies are of the type SNMP Interceptor. This Aspect consists of the following policy templates:

CI Type Policy Template Indicator Policy Description Policy Type
Computer Sys_HPProLiant_BladeType2Traps NA This policy intercepts SNMP traps related to Blade Type 2. SNMP Interceptor Template
Computer Sys_HPProLiant_CPQCLUSTraps NA This policy intercepts SNMP traps related to clusters in terms of the state of the battery, monitor, Hot Plug Slot Board, memory, and hood. SNMP Interceptor Template
Computer Sys_HPProLiant_CPQCMCTraps NA This policy intercepts SNMP traps related to the health of the Console Management Controller (CMC) in terms of power consumption, smoke, humidity, temperature, and fan. SNMP Interceptor Template
Computer Sys_HPProLiant_CPQHLTHTraps NA This policy intercepts SNMP traps related to the health of the server. SNMP Interceptor Template
Computer Sys_HPProLiant_CPQNICTraps NA This policy intercepts SNMP traps related to the performance and availability of the Network Interface Card (NIC). SNMP Interceptor Template
Computer Sys_HPProLiant_CPQRackTraps NA This policy intercepts SNMP traps related to rack information in terms of temperature, power, and status. SNMP Interceptor Template
Computer Sys_HPProLiant_CPQRCTraps NA This policy intercepts SNMP traps related to the performance and availability of the RAID Controller. SNMP Interceptor Template
Computer Sys_HPProLiant_CPQRPMTraps NA This policy intercepts SNMP traps related to Rack Power Manager. SNMP Interceptor Template
Computer Sys_HPProLiant_CPQSSTraps NA This policy intercepts SNMP traps related to storage systems in terms of fan status, temperature, and power supply. SNMP Interceptor Template
Computer Sys_HPProLiant_CPQSysInfoTraps NA This policy intercepts SNMP traps related to system information in terms of the state of the battery, monitor, Hot Plug Slot Board, memory, and hood. SNMP Interceptor Template
Computer Sys_HPProLiant_CPQUPSTraps NA This policy intercepts SNMP traps related to Uninterrupted Power Supply (UPS) in terms of status, battery, and actions initiated by UPS. SNMP Interceptor Template
Computer Sys_HPProLiant_FwdDriveArrayTraps NA This policy intercepts SNMP traps related to Compaq’s Intelligent Drive Array. SNMP Interceptor Template
Computer Sys_HPProLiant_VCDomainTraps NA This policy intercepts SNMP traps related to the virtual connect domain. SNMP Interceptor Template
Computer Sys_HPProLiant_VCModuleTraps NA This policy intercepts SNMP traps related to the virtual connect module. SNMP Interceptor Template

Space Availability and Disk IOPS

This Aspect monitors the disk IO operations and space utilization of the system. This Aspect consists of the following policy templates:

CI Type Policy Template Indicator Policy Description Policy Type
Computer, Filesystem Sys_FileSystemUtilizationMonitor NA Monitors the disk capacity of logical filesystems. Measurement Threshold
Computer, Filesystem Sys_PerDiskUtilization-AT NA Multi-instance baseline for disk utilization. Requires HP Performance Agent to be running on the node. Measurement Threshold
Computer, Filesystem Sys_PerDiskAvgServiceTime-AT NA Monitors disk I/O service time. Requires HP Performance Agent to be running on the node. Measurement Threshold

System Infrastructure Discovery

This Aspect discovers and gathers information regarding the system resources, operating system, and applications on a managed node. This Aspect contains the following policy templates:

CI Type Policy Template Indicator Policy Description Policy Type
Computer OPC_PERL_INCLUDE_INSTR_DIR NA Sets OPC_PERL_INCLUDE_INSTR_DIR in the HP Operations Agent xpl config namespace. The value must be set to TRUE for Infrastructure SPI policies to work. Node Info Template
Computer Sys_SystemDiscovery NA Gathers service information from the managed nodes such as hardware resources, operating system attributes, and applications. Service Auto-Discovery Template

System Fault Analysis

This Aspect monitors the kernel log file, boot log file, and event log file for critical error conditions and instructions. This Aspect contains the following policy templates:

CI Type Policy Template Indicator Policy Description Policy Type
Computer Sys_LinuxKernelLog NA This policy template monitors the kernel log file /var/log/ and alerts in case of any kernel service failure. It checks for error conditions that match the <*> kernel: <@.service>: <*.msg> failed pattern in the kernel log file. If any matches are found, this condition sends an alert with minor severity. Logfile Entry
Computer Sys_LinuxBootLog NA

This policy template monitors the boot log file /var/log/boot.log and alerts in case of any system boot errors. It checks for the following conditions:

  • Service startup failed - Checks for error conditions that match the <*> <@.service>: <@.daemon> startup failed pattern in the boot log file. If any matches are found, this condition sends an alert with minor severity.
  • Service failed - Checks for error conditions that match the <*> <@.service>: <*.msg> failed pattern in the log file. If any matches are found, this condition sends an alert with critical severity.

Logfile Entry
Computer Sys_LinuxSecureLog NA This policy template alerts the user in case of any secure login failure. It checks for the error conditions that match the <*> sshd : Failed password for <@.user> from <*.host> port <#> ssh2 pattern. If any matches are found, this condition sends an alert with warning severity. Logfile Entry
Computer Sys_AIXErrptLog NA This policy template monitors the errpt log file /var/opt/OV/tmp/sispi/errpt.log and generates an error report from entries in an error log. It checks for error conditions by matching each column in the errpt log file against the <@.errcode> <2#.mo><2#.dd><2#.hh><2#.mm><2#.yy> <@> <@> <@.object> <*.msgtext> pattern. If any matches are found, this condition sends an alert with warning severity. Logfile Entry
Computer Sys_MSWindowsServer_DNSWarnError NA

This policy template monitors the log file for the Microsoft DNS server service and its corresponding process and forwards the error log entries with warning or error severity. The policy looks for the following errors recorded in the DNS log file:

  • The DNS server could not allocate memory for the resource record.
  • The DNS server was unable to service a client request due to a shortage of available memory.
  • The DNS server could not create a zone transfer thread.
  • The DNS server encountered an error while writing to a file.
  • The DNS server could not initialize the remote procedure call (RPC) service.
Windows Event Log
Computer Sys_MSWindowsServer_DHCPWarnError NA

This policy template monitors the DHCP event logs and forwards the event log entries with warning or error severity. The policy looks for the following errors:

  • Iashlpr cannot contact the NPS service.
  • There are no IP addresses available for BOOTP clients in the scope or superscope.
  • The DHCP server is unable to reach the NPS server for determining the client's NAP access state.
  • There are no IP addresses available for lease in the scope or superscope.
  • The DHCP service failed to initialize the audit log.
  • The DHCP/BINL service on the local computer has determined that it is not authorized to start.
  • The DHCP/BINL service on this workgroup server has encountered another server with IP Address.
  • The DHCP service failed to restore the DHCP registry configuration.
  • The DHCP service was unable to read the global BOOTP file name from the registry.
  • The DHCP service is not servicing any clients because there are no active interfaces.
  • There is no static IP address bound to the DHCP server.
  • The DHCP Server service failed to register with Service Controller.
  • The DHCP Server service failed to initialize its registry parameters.
Windows Event Log
Computer Sys_MSWindowsServer_NFSWarnError NA

This policy template monitors the NFS event logs and forwards the event log entries with warning or error severity. The policy looks for the following errors:

  • Server for NFS detected a low disk space condition and has stopped recording audits.
  • The audit log has reached its maximum file size.
  • Server for NFS could not register with RPC Port Mapper.
  • The Server for NFS received a failure from the NFS driver during phase 2 initialization.
Windows Event Log
Computer Sys_MSWindowsServer_TerminalServiceWarnError NA

This policy template forwards the terminal service event log entries with warning or error severity. The policy looks for the following errors:

  • A connection request was denied because the terminal server is currently configured to not accept connections.
  • Autoreconnect failed to reconnect user to session because authentication failed.
  • Terminal Service start failed.
  • The terminal server received a large number of incomplete connections.
Windows Event Log
Computer Sys_MSWindowsServer_WindowsLogonWarnError NA

This policy template monitors the Windows logon and initialization event logs and forwards the error log entries with warning or error severity. The policy looks for the following errors recorded in the Windows log file:

  • Windows license is invalid.
  • Windows license activation failed.
  • The Windows logon process has failed to switch on the desktop.
  • The Windows logon process has unexpectedly terminated.
  • The Windows logon process has failed to spawn a user application.
  • The Windows logon process has failed to terminate currently logged on user's processes.
  • The Windows logon process has failed to disconnect the user session.
Windows Event Log
host_node Sys_LINUXBadLogins NA Monitors bad logins on the Linux platform. LogFile Entry
host_node Sys_AIXSu NA Monitors the AIX log file definition of su log file. LogFile Entry
host_node Sys_HPUXBadLogs NA Monitors history of bad logins (/var/adm/btmp logfile) in HP-UX 10.x/11.x platform. LogFile Entry
host_node Sys_SunSolarisSyslog NA NFS server ok [SISPI-SOL-syslog.41] LogFile Entry
host_node Sys_AIXBadLogs NA Monitors the history of failed logins in AIX platform. LogFile Entry
host_node Sys_AIXLogins NA Monitors the history of logins and logouts [OSSPI-AIX-Logins_1] in AIX platform. LogFile Entry
host_node Sys_MSWindowsServer_DNSWarnError NA Forwards DNS event log entries with warning or error severity. Windows Event Log
host_node Sys_HPUXsyslog NA Monitors messages passed into the syslog.log file. LogFile Entry
host_node Sys_HPUXLogins NA Monitors the history of logins and logouts (/var/adm/wtmp logfile) in HP-UX 10.x/11.x platform. LogFile Entry
host_node Sys_AIXSyslog NA Monitors messages concerning services and passed into syslog.log file. LogFile Entry
host_node Sys_SunSolarissnmplog NA Monitors SNMP log file entries. LogFile Entry
host_node Sys_SunSolarisBadLogs NA Monitors the history of failed logins (/var/adm/loginlog) in Solaris platform. LogFile Entry
host_node Sys_SunSolarisLogins NA Monitors the history of logins and logouts (/var/adm/wtmpx file) in Solaris platform. LogFile Entry
host_node Sys_HPUXSu NA Monitors the switch user events in log file (/var/adm/sulog) in HP-UX 10.x/11.x platform. LogFile Entry
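
To make the log file patterns in the table above more concrete, the following minimal Python sketch approximates the Sys_LinuxSecureLog pattern with a regular expression. The Operations Agent pattern-matching language is not regular-expression syntax; the translation used here (`<*>` as any text, `<@.name>` as a run of non-whitespace characters captured under a name, `<#>` as digits) is an assumption made for illustration, and the function name and sample log line are hypothetical.

```python
import re
from typing import Dict, Optional

# Assumed regex translation of the pattern
#   <*> sshd : Failed password for <@.user> from <*.host> port <#> ssh2
# The optional "[pid]" after "sshd" is an extra allowance for typical
# sshd log lines and is not part of the documented pattern.
FAILED_PASSWORD = re.compile(
    r"^.*sshd(?:\[\d+\])?\s*:\s*Failed password for (?P<user>\S+) "
    r"from (?P<host>.+?) port \d+ ssh2"
)


def match_failed_password(line: str) -> Optional[Dict[str, str]]:
    """Return the captured user and host, or None if the line does not match."""
    m = FAILED_PASSWORD.search(line)
    return m.groupdict() if m else None


# Hypothetical sample line in the usual sshd log layout.
sample = (
    "Jan  5 10:12:01 web01 sshd[2211]: "
    "Failed password for root from 10.0.0.5 port 4022 ssh2"
)
print(match_failed_password(sample))
```

The named groups play the role of the `user` and `host` variables that the policy pattern assigns, which the alert message can then reference.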

System Log Collection

This Aspect collects data from various system log files and forwards it to log receiver target servers in JSON format. Before deployment, set the URL in the Sys_DataForwarding policy. For more information, see Configure Streaming of Logs.

CI Type Policy Template Indicator Policy Description Policy Type
Computer Sys_SyslogStreaming NA Normalizes the syslog messages and converts them to JSON data. The list of log files is read from /etc/rsyslog.conf or /etc/syslog.conf (depending on the OS and version). For example, /var/log/messages, /var/log/cron, /var/log/maillog. Generic Output from Structured Log File
Computer Sys_ApplicationLog NA This policy collects data from the Windows Event Application Log. Generic output from Windows Event Log
Computer Sys_SecurityLog NA This policy collects data from the Windows Event Security Log. Generic output from Windows Event Log
Computer Sys_SystemLog NA This policy collects data from the Windows Event System Log. Generic output from Windows Event Log
Computer Sys_CustomLog NA This is a parameterized policy for the Windows Event Log. Define a custom log and enter the log file path as a parameter to forward the data to the required server. Generic output from Windows Event Log
Computer Sys_DataForwarding NA Forwards the log data collected by 'Generic Output from Structured Log File' policies to the target log receiver server. Before deployment, enter a valid 'URL' under the 'Targets > Forwarding Target Properties' tab. DataForwarding
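
As a rough illustration of what normalizing a syslog message into JSON can look like, the following Python sketch parses one line in the traditional syslog layout and emits a JSON record. The JSON schema that Sys_SyslogStreaming actually produces is not described in this section, so the field names (`timestamp`, `host`, `tag`, `pid`, `message`, `source`), the function name, and the sample line are all hypothetical.

```python
import json
import re
from typing import Optional

# Traditional syslog line layout: "Mon DD HH:MM:SS host tag[pid]: message".
SYSLOG_LINE = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\s"
    r"(?P<tag>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?:\s?"
    r"(?P<message>.*)$"
)


def syslog_to_json(line: str, source: str) -> Optional[str]:
    """Convert one syslog line to a JSON string; return None if it does not parse."""
    m = SYSLOG_LINE.match(line)
    if m is None:
        return None
    record = m.groupdict()
    # Hypothetical extra field: the log file the line was read from.
    record["source"] = source
    return json.dumps(record, sort_keys=True)


# Hypothetical sample line from /var/log/cron.
print(syslog_to_json(
    "Mar  3 04:02:11 db01 crond[812]: (root) CMD (run-parts /etc/cron.hourly)",
    "/var/log/cron",
))
```

A line without a process ID still parses; in that case the `pid` field is emitted as JSON `null`.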

User Logins

This Aspect checks the number of failed logins and last logins on your system. This Aspect consists of the following policy templates:

CI Type Policy Template Indicator Policy Description Policy Type
Computer Sys_MSWindowsFailedLoginsCollector NA

This policy checks for the number of:

  • Failed login attempts on Microsoft Windows.
  • Invalid logins, either due to unknown username or incorrect password on the managed node.
  • The policy logs individual instances of failed login into the GBL_NUM_FAILED_LOGINS metric in EPC. By default, the time interval is 1 hour.
Scheduled Task Template
Computer Sys_MSWindowsLastLogonsCollector NA

This policy template performs the following tasks:

  • Checks for the logon details of all the active local user accounts on Microsoft Windows.
  • Logs individual instances of user logon into the SECONDS_SINCE_LASTLOGIN metric in EPC. By default, the time interval is 1 hour.
Scheduled Task Template
Computer Sys_UNIXFailedLoginsCollector NA

This policy checks for the number of:

  • Failed login attempts on RHEL and SLES Linux systems, HP-UX, AIX, and Solaris operating systems.
  • Invalid logins, either due to unknown username or incorrect password on the managed node.
  • The policies log individual instances of failed login into the GBL_NUM_FAILED_LOGINS metric in EPC. By default, the time interval is 1 hour.
Scheduled Task Template
Computer Sys_LinuxLastLogonsCollector NA This policy checks for the logon details of all the active local user accounts on RHEL and SLES Linux operating systems. The policy logs individual instances of user logon into the SECONDS_SINCE_LASTLOGIN metric in EPC. By default, the time interval is 1 hour. Scheduled Task Template

Note: For the Sys_UNIXFailedLoginsCollector policy to function correctly when deployed on a Solaris node, the following prerequisites must be met:

  • Set the following variables in the /etc/default/login file:

    SYSLOG=YES

    SYSLOG_FAILED_LOGINS=1

  • In the /etc/syslog.conf file, check that the following line is present:

    auth.notice ifdef(`LOGHOST', /var/log/authlog, @loghost)

  • Refresh syslogd using the following command:

    svcadm refresh system/system-log
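
The Solaris prerequisites listed above can be checked with a short script before the policy is deployed. The following Python sketch is an illustration only: the function names are hypothetical, and it assumes that /etc/default/login uses plain KEY=VALUE lines and that the syslog rule appears on a single uncommented line.

```python
from typing import Dict, List

# Settings that the note above requires in /etc/default/login.
REQUIRED_LOGIN_SETTINGS: Dict[str, str] = {
    "SYSLOG": "YES",
    "SYSLOG_FAILED_LOGINS": "1",
}


def parse_key_values(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_prerequisites(login_text: str, syslog_conf_text: str) -> List[str]:
    """Return descriptions of the prerequisites that are not met."""
    problems: List[str] = []
    settings = parse_key_values(login_text)
    for key, expected in REQUIRED_LOGIN_SETTINGS.items():
        if settings.get(key) != expected:
            problems.append(f"/etc/default/login: expected {key}={expected}")
    # Look for an uncommented auth.notice rule that mentions /var/log/authlog.
    has_auth_rule = any(
        line.split()[0] == "auth.notice" and "/var/log/authlog" in line
        for line in syslog_conf_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    )
    if not has_auth_rule:
        problems.append("/etc/syslog.conf: no auth.notice rule for /var/log/authlog")
    return problems
```

On a Solaris node, the two arguments would be the contents of /etc/default/login and /etc/syslog.conf. An empty result means that the file-based prerequisites are in place; syslogd still needs to be refreshed after any change.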

The Sys_UNIXFailedLoginsCollector policy obtains failed login data from the following sources:

On Solaris nodes: /var/log/authlog

On Linux nodes: lastb command

On HP-UX nodes: lastb command

On AIX nodes: /etc/security/failedlogin log

VMWareVC Self monitoring

Self monitoring policies for VMWare vCenter.

CI Type Policy Template Indicator Description Policy Type
host_node VMwareVC_SelfMonMemoryUsage NA SelfMon policy - uses OM Agent ProcessDetails() API. Tracks memory usage of the Videmon and Opsagt processes. Measurement Threshold
host_node VMwareVC_SelfMonDBCorruptionMonitor NA This policy monitors corruption of Operations Agent datastore disk image and helps in its recovery. LogFile Entry
host_node VMwareVC_SelfMonDiskUsage NA Tracks Disk usage of OA database. Measurement Threshold
host_node VMwareVC_SelfMonCPUUsage NA SelfMon policy - uses OM Agent ProcessDetails() API. Tracks CPU usage of the Videmon and Opsagt processes. Measurement Threshold

Related Topics

For more information about how baselining is computed, see the section Overview of Baselining in the Operations Agent (12.00 or later) documentation.