Example: Restart all processes on a host

The following example illustrates how to restart all processes on a Service Manager host because the system is running out of memory. This scenario uses the following system configuration:

System property                                Value
Number of hosts                                3
Total Service Manager processes                21
Threads per process                            40
Maximum number of concurrent users expected    700
Maximum user capacity                          840

This horizontally scaled configuration provides 840 user connections (21 processes with 40 threads each). It can support the expected 700 concurrent users with an extra 20% capacity to handle high usage and outages.
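The capacity figures in the table can be checked with a short calculation. This is a minimal sketch in Python, not part of Service Manager; the variable names are illustrative, and it assumes each thread serves one user connection, which is what the table's capacity figure implies.

```python
# Values taken from the system configuration table above.
total_processes = 21
threads_per_process = 40
expected_peak_users = 700

# Assumption: one thread serves one concurrent user connection.
max_capacity = total_processes * threads_per_process

# Extra capacity relative to the expected peak load.
headroom = max_capacity / expected_peak_users - 1

# Number of processes that can be unavailable at the same time
# while the remaining ones still cover the expected peak load.
spare_processes = (max_capacity - expected_peak_users) // threads_per_process

print(f"Maximum user capacity: {max_capacity}")
print(f"Headroom over expected peak: {headroom:.0%}")
print(f"Processes that can be down at peak load: {spare_processes}")
```

Under these assumptions, the spare capacity corresponds to a small number of processes, which is why it helps to spread restarts out rather than quiesce many processes at once.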

Problem

The system administrator recently applied some new JavaScripts and has since noticed out-of-memory exceptions in the sm.log file. All Service Manager processes show high memory usage even at half capacity (20 concurrent user connections per process).

Recommendation

While the root cause of the high memory usage is being determined, the system administrator can schedule a periodic restart of the Service Manager processes on each host. Restarting the processes temporarily frees up system memory until the next system maintenance downtime or until the root cause of the problem is identified and fixed.

The administrator uses an operating system scheduler to run the following commands at midnight every three days:

sm -restart:0 -host:<host1>
sm -restart:180 -host:<host2>
sm -restart:360 -host:<host3>

These commands cause host1 to restart immediately, host2 to restart in three hours, and host3 to restart in six hours. To avoid having all Service Manager processes on a host quiesced at one time, the administrator adds the following parameters to the sm.ini file on each host:

restartRandMax:60
restartGraceInterval:15

The restartRandMax:60 value ensures that each host finishes restarting all of its Service Manager processes within one hour of the restart time specified in the command. In this case, host1 restarts anywhere from 0 to 60 minutes after receiving the restart command, host2 restarts after 180 to 240 minutes, and host3 restarts after 360 to 420 minutes. The restartGraceInterval:15 value ensures that users on each Service Manager process have 15 minutes to save their work and log in again before the process restarts. Users who log off one process can immediately log in again and continue working on another available process.
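The restart windows described above follow from adding the restartRandMax value to each host's restart delay. The following Python sketch only models that arithmetic as this example describes it; it is not how Service Manager implements the parameters, and the function and variable names are hypothetical.

```python
def restart_window(delay_minutes, rand_max_minutes):
    """Return (earliest, latest) restart time in minutes after the command is issued."""
    return delay_minutes, delay_minutes + rand_max_minutes

# Values from the sm -restart commands and the sm.ini parameters in this example.
RESTART_RAND_MAX = 60
delays = {"host1": 0, "host2": 180, "host3": 360}

windows = {host: restart_window(delay, RESTART_RAND_MAX)
           for host, delay in delays.items()}

for host, (earliest, latest) in windows.items():
    print(f"{host}: restarts {earliest} to {latest} minutes after the command")

# Check that no two hosts have overlapping restart windows.
ordered = sorted(windows.values())
no_overlap = all(prev[1] <= nxt[0] for prev, nxt in zip(ordered, ordered[1:]))
print("Windows overlap:", not no_overlap)
```

With a three-hour spacing between commands and a one-hour random window, each host's window closes two hours before the next one opens, so at most one host is restarting processes at any given time.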

Note: A host restart command does not restart the load balancer process. The only way to restart a load balancer process is to specify it by process ID. The system cannot accept new connection requests until after the load balancer process restarts.