Scheduled restart of Service Manager processes

The restart command allows administrators to schedule the restart of one or all Service Manager processes on a host. Restarting one or more processes allows Service Manager to offer high availability without having to restart the entire Service Manager cluster. Typically, administrators want to restart Service Manager processes for one of the following reasons:

  • The Service Manager system is running low on system resources while at normal load
  • A particular Service Manager process is consuming a large amount of system resources
  • The administrator has some need to regularly restart Service Manager processes

These symptoms may arise from various causes, such as faulty tailoring changes or unstable third-party libraries that increase memory consumption on your system. Restarting one or all Service Manager processes allows you to temporarily work around any performance issues while you diagnose the root cause and determine a permanent fix.

The restart command has two basic modes of operation:

  • Restart a particular process
  • Restart all processes on a host

Restart a particular process

If you can identify the processes consuming system resources, you can restart them individually by process ID with the pid parameter. A process restart uses the following workflow:

  1. The restart command notifies the process of a restart request
  2. The process operates normally during the restart waiting period
  3. The process goes into quiesce mode during the restart grace interval
  4. The process restarts

You can schedule when a Service Manager process restarts with the value of the restart command. The value of the restart command determines the minimum number of minutes a process waits before restarting. By default, there is no restart waiting period value, so a process begins the restart immediately.

During the restart waiting period, an administrator can cancel the restart by reissuing the restart command with a value of -1. You can specify the same process ID you originally listed with the pid parameter to cancel a process restart. You can only cancel a restart command during the restart waiting period. After a process is in the grace interval, you can no longer cancel a restart command.
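As an illustration of scheduling and then canceling a restart for one process, the following dry-run sketch only prints the commands rather than running them. The restart_cmd helper is hypothetical. The command form follows the sm -restart:<minutes> -host:<host> -pid:<pid> pattern shown in the example later in this topic, and the host address and process ID are the illustrative values from that example.

```shell
#!/bin/sh
# Dry-run sketch: build (and only print) restart commands for one process.
# restart_cmd is a hypothetical helper, not part of Service Manager.
restart_cmd() {
    minutes=$1
    host=$2
    pid=$3
    printf 'sm -restart:%s -host:%s -pid:%s\n' "$minutes" "$host" "$pid"
}

# Schedule a restart of process 3433 no sooner than 30 minutes from now.
restart_cmd 30 15.80.177.12 3433

# Cancel it while the waiting period is still running (value -1, same pid).
restart_cmd -1 15.80.177.12 3433
```

Printing the command first makes it easy to check the process ID before issuing the command for real.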

After the restart waiting period has expired, the server checks to see if there is a restartGraceInterval value. This parameter determines how long the process will be quiesced and not accept any new connections (not even administrator accounts). This allows users currently connected to the process to complete their work and log off prior to the restart. Service Manager displays the following message to users during the grace interval:

Your Service Manager session is shutting down in %d minutes for maintenance. Please save your work and log out. You can log in again immediately.

Service Manager replaces the variable %d with the restartGraceInterval value. You can edit or localize the restart message from the notification engine.

By default, there is no restartGraceInterval value, so all processes restart immediately. If you provide a restartGraceInterval, the process remains quiesced for up to the number of minutes supplied or until all users log off, whichever comes first. When either condition is met, the process restarts.
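The phases of a single-process restart can be laid out as a simple timeline. The sketch below is only an illustration: process_timeline is a hypothetical helper, and the values 30 (restart command value) and 15 (restartGraceInterval) are assumed for the example.

```shell
#!/bin/sh
# Sketch of the single-process timeline, in minutes after the command is issued.
# process_timeline is a hypothetical helper, not part of Service Manager.
#   $1 = restart command value (waiting period), $2 = restartGraceInterval
process_timeline() {
    wait_min=$1
    grace_min=$2
    latest=$((wait_min + grace_min))
    echo "0-${wait_min}: normal operation, restart can still be canceled"
    echo "${wait_min}-${latest}: quiesced, no new connections, cannot be canceled"
    echo "${latest}: latest restart time (earlier if all users log off)"
}

process_timeline 30 15
```

With these assumed values the process restarts no later than 45 minutes after the command, and as early as 30 minutes if no users remain connected.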

Note: Processes dedicated to Web Services connections or background schedulers ignore the restartGraceInterval value because they are stateless connections that do not require a user to log off. These processes restart immediately after the restart waiting period expires. After restart, the server will only run the background scheduler processes listed in the sm.cfg file or started from the OS command prompt. Background schedulers started from the System Status form are not automatically resumed.

Restart all processes on a host

If you cannot identify particular Service Manager processes that are consuming system resources, you can restart all Service Manager processes on a host with the host parameter.

Note: A host restart command does not restart the load balancer process. The only way to restart a load balancer process is to specify it by process ID.

A host restart uses the following workflow:

  1. The restart command notifies all processes on the host of the restart request
  2. The restart command randomly assigns a time extension to the restart time of each process
  3. The processes operate normally during the restart waiting period
  4. The processes go into quiesce mode during the restart grace interval
  5. The processes restart

When Service Manager processes receive a restart command, they first calculate a future restart time based on the value provided with the restart command and the restartRandMax sm.ini parameter. The value of the restart command determines the minimum amount of time a process waits before restarting. By default, there is no restart waiting period value, so the waiting period consists only of the randomly assigned time extension. The restartRandMax parameter extends the restart waiting period by a random number of minutes from 0 to the value provided for the parameter.

The purpose of the random restart time extension is to minimize the chance that two or more processes restart at the same time since each process that restarts briefly reduces system capacity. If you are only restarting one process, there is no need to stagger restart times to preserve capacity and therefore the server ignores any restartRandMax value.
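To make the effect of the random extension concrete, the following sketch computes the window in which the waiting period of each process on a host can end. The restart_window helper is hypothetical, and the values 180 and 60 are assumed for illustration.

```shell
#!/bin/sh
# Sketch: earliest and latest end of the waiting period for a host restart.
# restart_window is a hypothetical helper, not part of Service Manager.
#   $1 = restart command value in minutes, $2 = restartRandMax in minutes
restart_window() {
    echo "$1 $(( $1 + $2 ))"
}

restart_window 180 60   # prints "180 240"
```

Each process on the host picks its own point inside this window, which is what spreads the restarts out.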

During the restart waiting period, an administrator can cancel the restart by reissuing the restart command with a value of -1. You can only cancel a restart command during the restart waiting period. After a process is in the grace interval, you can no longer cancel a restart command.

After the restart waiting period has expired, the server checks to see if there is a restartGraceInterval value. This parameter determines how long the process will be quiesced and not accept any new connections (not even administrator accounts). This allows users currently connected to the process to complete their work and log off prior to the restart. Service Manager displays the following message to users during the grace interval:

Your Service Manager session is shutting down in %d minutes for maintenance. Please save your work and log out. You can log in again immediately.

Service Manager replaces the variable %d with the restartGraceInterval value. You can edit or localize the restart message from the notification engine. It is IDS_ALERT_RESTART message 126.

By default, there is no restartGraceInterval value, so all processes restart immediately. If you provide a restartGraceInterval, the process remains quiesced for up to the number of minutes supplied or until all users log off, whichever comes first. When either condition is met, the process restarts.

Note: Processes dedicated to Web Services connections, background schedulers, or the load balancer process ignore the restartGraceInterval value because they are stateless connections that do not require a user to log off. These processes restart immediately after the restart waiting period expires. After restart, the server will only run the background scheduler processes listed in the sm.cfg file or started from the OS command prompt. Background schedulers started from the System Status form are not automatically resumed.

Recommendations

While a process is quiesced, the system loses some connection capacity because there are fewer processes to accept connection requests. Restarting one Service Manager process typically takes a short amount of time, approximately 30 seconds depending upon the system. In general, quiescing processes reduces capacity for longer than restarting them (several minutes quiesce time compared to 30 seconds restart time). For this reason, we recommend you set a high restartRandMax value to minimize the chance of two processes restarting at the same time, and a low restartGraceInterval value to minimize the amount of time your system has reduced capacity from quiesced processes.

If you want to regularly schedule the restart of one or all of your Service Manager processes, you must use your operating system's scheduling tools to run the restart command.

Example: Restart all processes on a host

The following example illustrates how to restart all processes on a Service Manager host because the system is running out of memory. This scenario uses the following system configuration:

System property                               Value
Number of hosts                               3
Total Service Manager processes               21
Threads per process                           40
Maximum number of concurrent users expected   700
Maximum user capacity                         840

This horizontally scaled configuration can support 700 concurrent users with an extra 20% capacity to handle high usage and outages.
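The capacity figures in the table above can be checked with a few lines of arithmetic. This is a sketch that assumes each thread serves one user connection and that users are spread evenly across processes. The spare_processes figure is derived here and is not stated in the configuration.

```shell
#!/bin/sh
# Worked arithmetic for the example configuration.
processes=21
threads_per_process=40
expected_users=700

# Maximum user capacity: 21 * 40 = 840
capacity=$((processes * threads_per_process))
# Extra capacity over the expected peak: (840 - 700) * 100 / 700 = 20
headroom_pct=$(( (capacity - expected_users) * 100 / expected_users ))
# Whole processes that can be unavailable at peak load: 140 / 40 = 3
spare_processes=$(( (capacity - expected_users) / threads_per_process ))

echo "capacity=$capacity headroom=${headroom_pct}% spare_processes=$spare_processes"
```

Under these assumptions, up to three processes can be quiesced or restarting at the same time before the system drops below the capacity needed for 700 concurrent users.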

Problem

The system administrator recently applied some new JavaScripts and since then has noted out of memory exceptions in the sm.log file. All Service Manager processes have high memory usage even when only at half capacity (20 concurrent user connections per process).

Recommendation

While the root cause of the high memory usage is being determined, the system administrator can schedule a periodic restart of the Service Manager processes on each host. Restarting the processes will temporarily free up system memory until the next system maintenance down time or until the root cause of the problem is identified and fixed.

The administrator uses an operating system scheduler to run the following commands at midnight every three days:

sm -restart:0 -host:<host1>
sm -restart:180 -host:<host2>
sm -restart:360 -host:<host3>

These commands cause host1 to restart immediately, host2 to restart in three hours, and host3 to restart in six hours. To avoid having all Service Manager processes on a host quiesced at one time, the administrator adds the following parameters to the sm.ini file on each host:

restartRandMax:60
restartGraceInterval:15

The restartRandMax:60 value ensures that each host finishes restarting all Service Manager processes within one hour of receiving the restart command. In this case, host1 restarts anywhere from 0 to 60 minutes after receiving the restart command, host2 restarts after 180 to 240 minutes, and host3 restarts after 360 to 420 minutes. The restartGraceInterval:15 value ensures that users on each Service Manager process have 15 minutes to save their work and log in again before the process restarts. Users who log off one process can immediately log in again and continue work on another available process.
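One way to package the three scheduled commands is a small wrapper script that the operating system scheduler calls. The sketch below is an assumption-laden illustration: the staggered_restarts helper, the DRY_RUN switch, and the host names host1 to host3 are hypothetical, and by default the script only prints the commands instead of running sm.

```shell
#!/bin/sh
# Sketch of a wrapper an OS scheduler could call.
# Hypothetical names: staggered_restarts, DRY_RUN, host1..host3.
# DRY_RUN=1 (the default here) prints the commands instead of running sm.
DRY_RUN=${DRY_RUN:-1}

staggered_restarts() {
    offset=0
    for host in "$@"; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "sm -restart:$offset -host:$host"
        else
            sm "-restart:$offset" "-host:$host"
        fi
        offset=$((offset + 180))   # three hours between hosts
    done
}

staggered_restarts host1 host2 host3
```

On a system that uses cron, a schedule field such as 0 0 */3 * * approximates "midnight every three days", although that pattern starts counting again at the beginning of each month.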

Note: A host restart command does not restart the load balancer process. The only way to restart a load balancer process is to specify it by process ID. The system cannot accept new connection requests until after the load balancer process restarts.

Example: Restart one process

The following example illustrates how to restart one process on a Service Manager host because it is consuming a high amount of system resources. This scenario uses the following system configuration:

System property                               Value
Number of hosts                               3
Total Service Manager processes               21
Threads per process                           40
Maximum number of concurrent users expected   700
Maximum user capacity                         840

This horizontally scaled configuration can support 700 concurrent users with an extra 20% capacity to handle high usage and outages.

Problem

The system administrator notes that one Service Manager process is consuming a large amount of system resources such as CPU time or system memory.
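How the administrator spots the process depends on the operating system tools available. As a minimal sketch, the hypothetical heaviest_pid helper below picks the process ID with the largest resident memory from rows of "PID RSS_KB COMMAND". The sample rows are made-up data, and the process name sm is an assumption. On Linux, similar rows can be produced with ps -eo pid,rss,comm.

```shell
#!/bin/sh
# Sketch: pick the PID with the largest resident memory for a given command name.
# heaviest_pid is a hypothetical helper; it reads "PID RSS_KB COMMAND" rows on stdin.
heaviest_pid() {
    awk -v name="$1" '
        BEGIN { max = -1 }
        $3 == name && $2 + 0 > max { max = $2 + 0; pid = $1 }
        END { if (pid != "") print pid }
    '
}

# Made-up sample rows for illustration only.
heaviest_pid sm <<'EOF'
3401 512000 sm
3433 2048000 sm
3450 498000 sm
4100 3000000 java
EOF
```

The resulting process ID is the value to pass to the pid parameter of the restart command.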

Recommendation

While the root cause of the issue is being determined, the system administrator can schedule the restart of the Service Manager process. Restarting the process will temporarily free up system memory until the next system maintenance down time or until the root cause of the problem is identified and fixed.

The administrator uses the following command to restart the affected process:

sm -restart:0 -host:15.80.177.12 -pid:3433

This command causes process ID 3433 on the identified host to restart immediately. To provide the currently connected users time to save their work, the administrator adds the following parameter to the sm.ini file:

restartGraceInterval:15

The restartGraceInterval:15 value ensures that users on the Service Manager process have 15 minutes to save their work and log in again before the process restarts. Users who log off one process can immediately log in again and continue work on another available process.

Note: A host restart command does not restart the load balancer process. The only way to restart a load balancer process is to specify it by process ID. The system cannot accept new connection requests until after the load balancer process restarts.

 

Related topics

Parameter: restart

Parameter: restartGraceInterval

Parameter: restartRandMax