Lock management
Versions of Service Manager (SM) prior to 9.31 used a locking mechanism that relied on multicasting to request and obtain a lock on a resource. This mechanism was implemented using a Peer Lock in the JGroups toolkit. However, this implementation had several issues, which are addressed by the new locking mechanism introduced in this document.
Issues in Service Manager Multicast Communication
Service Manager’s previous multicast implementation suffers from the following limitations:
• All nodes must be contacted by the node that is requesting the resource. This leads to high overhead per request.
• All nodes must give approval to a request, regardless of whether a node is using a resource or not.
• A node that does not respond represents a single point of failure for the system, even if that node has no other connection to the resource or the requesting node.
• If one node does not respond, the request must be re-issued by the originating node, which increases overhead even further.
• Nodes that are slow to respond will eventually be removed from the node cluster.
• Scalability is very poor.
• The potential for netstorms is high (wherein every node is attempting to request permission from every other node, leading to n^2 requests).
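As a rough illustration of the last point, the sketch below counts the messages sent when every node asks every other node for permission once. The function name is hypothetical and the model ignores re-issued requests, which would only raise the count.

```python
def broadcast_requests(nodes: int) -> int:
    """Requests sent if each node contacts every other node exactly once."""
    return nodes * (nodes - 1)


# Growth is quadratic: doubling the cluster roughly quadruples the traffic.
for n in (5, 10, 20):
    print(n, "nodes ->", broadcast_requests(n), "requests")
```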
New Locking Mechanism Overview
The new locking mechanism consists of a record entry for each locked resource in a database table. The new Lock table (for exclusive locks) and LockShared table (for shared locks) have been created to house these records. See the following tables for details on the structure and fields of the Lock and LockShared tables.
Note: The two tables differ only in the value of the TYPE field and in the primary key. In the LockShared table, the primary key is a combination of the LOCKID, PID, TID, and IP fields.
Lock table
Field | Type | Null | Key | Description |
---|---|---|---|---|
LOCKID | varchar(600) | N | PK | Hex value of ResourceName. This ensures that the ResourceName field can be either case-sensitive or case-insensitive. |
RESOURCENAME | varchar(200) | N | | Logical lock ID |
TYPE | char(1) | | | Exclusive / Shared lock |
PID | float | | | The Process ID that holds the lock. |
TID | float | | | The Thread ID that holds the lock. |
RADTHREADID | float | | | The Rad Thread ID that holds the lock. |
SESSIONID | float | | | The session ID that holds the lock. |
REASON | VARCHAR2(60) | | | Application private |
USER | VARCHAR2(60) | | | SM login user (for example, "falcon") |
HOSTNAME | VARCHAR2(60) | | | The host that holds the lock. |
IP | VARCHAR2(60) | | | The IP address of the host that holds the lock. |
DEVICENAME | VARCHAR2(60) | | | The device that holds the lock. For the bg scheduler thread, the devicename is "SYSTEM". |
LOCKAT | datetime | | | When the lock was obtained. |
STARTAT | datetime | | | When the lock was requested. |
RETRYCOUNT | float | | | Number of times the lock was requested. |
HEARTBEAT | float | | | Updated periodically by the lock holder. |
SUSPECTED | float | | | Indicates that another node suspects the lock holder has failed. |
SYSRESTRICTED | CHAR(1) | | | Indicates that end users are not allowed to modify the record. |
SYSMODCOUNT | float | | | |
SYSMODUSER | VARCHAR(60) | | | |
SYSMODTIME | datetime | | | |
LockShared table
Field | Type | Null | Key | Description |
---|---|---|---|---|
LOCKID | varchar(600) | N | PK | Hex value of ResourceName. This ensures that the ResourceName field can be either case-sensitive or case-insensitive. |
RESOURCENAME | varchar(200) | N | | Logical lock ID |
TYPE | char(1) | N | PK | Shared lock |
PID | float | | PK | The Process ID that holds the lock. |
TID | float | | PK | The Thread ID that holds the lock. |
RADTHREADID | float | | | The Rad Thread ID that holds the lock. |
SESSIONID | float | | | The session ID that holds the lock. |
REASON | VARCHAR2(60) | | | Application private |
USER | VARCHAR2(60) | | | SM login user (for example, "falcon") |
HOSTNAME | VARCHAR2(60) | | | The host that holds the lock. |
IP | VARCHAR2(60) | | | The IP address of the host that holds the lock. |
DEVICENAME | VARCHAR2(60) | | | The device that holds the lock. For the bg scheduler thread, the devicename is "SYSTEM". |
LOCKAT | datetime | | | When the lock was obtained. |
STARTAT | datetime | | | When the lock was requested. |
RETRYCOUNT | float | | | Number of times the lock was requested. |
HEARTBEAT | float | | | Updated periodically by the lock holder. |
SUSPECTED | float | | | Indicates that another node suspects the lock holder has failed. |
SYSRESTRICTED | CHAR(1) | | | Indicates that end users are not allowed to modify the record. |
SYSMODCOUNT | float | | | |
SYSMODUSER | VARCHAR(60) | | | |
SYSMODTIME | datetime | | | |
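The LOCKID description says the key is a hex rendering of the resource name. The sketch below shows one way such a key could be derived. The function name, the UTF-8 encoding, the uppercase hex digits, and the lowercase folding for case-insensitive names are all assumptions made for illustration; the documentation does not specify how Service Manager computes the value.

```python
def make_lock_id(resource_name: str, case_sensitive: bool = True) -> str:
    """Derive a hex key from a resource name (illustrative only)."""
    # Assumption: case-insensitive names are folded to lowercase first,
    # so that "Probsummary" and "probsummary" map to the same key.
    name = resource_name if case_sensitive else resource_name.lower()
    # Assumption: the name is encoded as UTF-8 and written as hex digits.
    return name.encode("utf-8").hex().upper()


print(make_lock_id("ab"))  # prints 6162
```

Because the key is built from bytes instead of being compared as text, the database collation no longer decides whether two names collide; the caller decides by choosing whether to fold case before encoding.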
Lock Behavior
Locks may be either shared locks or exclusive locks. A shared lock allows multiple nodes to read the data from a resource. An exclusive lock may only be obtained if any shared locks on a resource have been released by the nodes that hold them.
Exclusive Lock
The process by which a node obtains an exclusive lock is as follows:
1. A node that requests an exclusive lock tries to insert a record into the Lock database table to see whether the resource is available.
2. If there is no record for this resource in the Lock table, the resource is available: the insertion succeeds and the node obtains the lock.
3. If there is already a record for this resource in the Lock table, the node fails to obtain the lock.
To unlock an exclusive lock, the corresponding lock record is removed from the Lock table.
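The exclusive-lock steps amount to an insert guarded by a primary key. The following is a minimal sketch of that idea, using an in-memory SQLite database as a stand-in for the Service Manager database. The function names, the reduced column set, and the one-character TYPE code "E" are assumptions for illustration only.

```python
import sqlite3


def open_lock_db() -> sqlite3.Connection:
    """Create a cut-down Lock table (only the columns this sketch needs)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE "Lock" (LOCKID TEXT PRIMARY KEY, TYPE TEXT, PID INTEGER)'
    )
    return conn


def try_exclusive_lock(conn: sqlite3.Connection, lock_id: str, pid: int) -> bool:
    """Return True if the insert succeeds, that is, if the lock was obtained."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                'INSERT INTO "Lock" (LOCKID, TYPE, PID) VALUES (?, ?, ?)',
                (lock_id, "E", pid),  # "E" is an assumed code for Exclusive
            )
        return True
    except sqlite3.IntegrityError:
        # A record for this resource already exists: the lock is held.
        return False


def release_exclusive_lock(conn: sqlite3.Connection, lock_id: str, pid: int) -> None:
    """Unlock by deleting the holder's record from the Lock table."""
    with conn:
        conn.execute(
            'DELETE FROM "Lock" WHERE LOCKID = ? AND PID = ?', (lock_id, pid)
        )


conn = open_lock_db()
print(try_exclusive_lock(conn, "6162", pid=100))  # first requester
print(try_exclusive_lock(conn, "6162", pid=200))  # second requester
```

Letting the primary-key constraint reject the second insert means the availability check and the acquisition happen in one atomic statement, so two nodes cannot both believe they obtained the lock.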
Shared Lock
The process by which a node obtains a shared lock is as follows:
1. A node that requests a shared lock tries to insert a record into the Lock database table to see whether the resource is available.
2. If there is no record for this resource in the Lock table, the node inserts a record into the Lock table and proceeds to step 4.
3. If there is a record in the Lock table, the TYPE field of that record is checked. If the TYPE field is “Exclusive,” the lock request is rejected. If the TYPE field is “Shared,” the node proceeds to step 4.
4. The node tries to insert a record into the LockShared table. If the insertion is successful, the shared lock request is granted. If the insertion fails, the shared lock request is rejected.
To unlock a shared lock, the corresponding lock record is removed from the LockShared table. Then, the corresponding lock record is removed from the Lock table unless other records share the same resource ID.
Note: There is no mechanism to escalate a Shared lock to an Exclusive lock. To obtain an exclusive lock on a resource, all shared locks must be released.
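The shared-lock procedure and its release rule can be sketched in the same style. This is an illustration under stated assumptions, not the product's implementation: SQLite stands in for the real database, the function names and the TYPE codes "E" and "S" are hypothetical, and the LockShared primary key follows the earlier Note (LOCKID, PID, TID, IP).

```python
import sqlite3

EXCLUSIVE, SHARED = "E", "S"  # assumed one-character TYPE codes


def open_lock_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "Lock" (LOCKID TEXT PRIMARY KEY, TYPE TEXT)')
    conn.execute(
        "CREATE TABLE LockShared (LOCKID TEXT, PID INTEGER, TID INTEGER, "
        "IP TEXT, PRIMARY KEY (LOCKID, PID, TID, IP))"
    )
    return conn


def try_shared_lock(conn, lock_id: str, pid: int, tid: int, ip: str) -> bool:
    try:
        with conn:  # one transaction; rolled back if step 4 fails
            try:
                # Steps 1 and 2: claim the resource row if it does not exist.
                conn.execute(
                    'INSERT INTO "Lock" (LOCKID, TYPE) VALUES (?, ?)',
                    (lock_id, SHARED),
                )
            except sqlite3.IntegrityError:
                # Step 3: the row exists, so inspect its TYPE.
                row = conn.execute(
                    'SELECT TYPE FROM "Lock" WHERE LOCKID = ?', (lock_id,)
                ).fetchone()
                if row is None or row[0] == EXCLUSIVE:
                    return False
            # Step 4: register this holder in LockShared.
            conn.execute(
                "INSERT INTO LockShared (LOCKID, PID, TID, IP) VALUES (?, ?, ?, ?)",
                (lock_id, pid, tid, ip),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def release_shared_lock(conn, lock_id: str, pid: int, tid: int, ip: str) -> None:
    with conn:
        conn.execute(
            "DELETE FROM LockShared WHERE LOCKID = ? AND PID = ? AND TID = ? AND IP = ?",
            (lock_id, pid, tid, ip),
        )
        remaining = conn.execute(
            "SELECT COUNT(*) FROM LockShared WHERE LOCKID = ?", (lock_id,)
        ).fetchone()[0]
        if remaining == 0:
            # Last holder gone: remove the resource row from Lock as well.
            conn.execute(
                'DELETE FROM "Lock" WHERE LOCKID = ? AND TYPE = ?', (lock_id, SHARED)
            )
```

In this sketch the row in the Lock table acts as a marker that the resource is in shared use, while the rows in LockShared record the individual holders. An exclusive request is blocked by the marker for as long as at least one holder remains.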
Lock Retry, Timeout, and Heartbeat
Each process that requires a lock has a dedicated LockHandler thread to handle all lock-related operations. When a process needs to execute a lock or unlock operation, it places a request in a queue. The LockHandler reads the queue and attempts to insert the appropriate records into the Lock or LockShared table. The LockHandler also returns the response to the process that requested the lock.
The LockHandler retries a “wait” lock request every two seconds until the lock attempt succeeds. If a “no-wait” lock request is specified, the LockHandler will reattempt lock acquisition immediately.
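The queue-and-retry arrangement for “wait” requests can be sketched as a worker thread. The class layout, method names, and reply queue below are assumptions for illustration; only the dedicated thread, the request queue, and the two-second retry interval come from the description above. The sketch does not model “no-wait” requests or unlock operations, and it serves one request at a time.

```python
import queue
import threading
import time
from typing import Callable


class LockHandler(threading.Thread):
    """Worker thread that serves lock requests placed in a queue."""

    def __init__(self, try_lock: Callable[[str], bool], retry_interval: float = 2.0):
        super().__init__(daemon=True)
        self.requests: queue.Queue = queue.Queue()
        self._try_lock = try_lock              # e.g. an insert into the Lock table
        self._retry_interval = retry_interval  # two seconds per the description

    def request_lock(self, resource: str) -> queue.Queue:
        """Called by the process: enqueue a request, get a reply channel back."""
        reply: queue.Queue = queue.Queue(maxsize=1)
        self.requests.put((resource, reply))
        return reply

    def stop(self) -> None:
        self.requests.put(None)  # sentinel that ends the run loop

    def run(self) -> None:
        while True:
            item = self.requests.get()
            if item is None:
                break
            resource, reply = item
            attempts = 1
            # "wait" semantics: keep retrying until the attempt succeeds.
            while not self._try_lock(resource):
                time.sleep(self._retry_interval)
                attempts += 1
            reply.put(attempts)  # report back to the requesting process
```

A caller would start the handler, call `request_lock("some resource")`, and block on `reply.get()` until the handler reports success. The attempt count returned here plays a role similar to the RETRYCOUNT field.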
It is possible that a node may obtain a lock, and then fail to release the lock for several reasons. For example, the node could fail, or a problem with the network may prevent communication between the nodes. To prevent other nodes from waiting for an unresponsive lock owner, the following heartbeat mechanism has been implemented:
- When a process requests a lock and finds the lock is already held and the heartbeat value has not been updated, it sets the “Suspected” flag of the lock record in the database table to 1.
- The lock owner updates the heartbeat and checks the “Suspected” flag of the lock record in the database table every 60 seconds.
- If the “Suspected” flag is set to 1, the lock owner sets the “Suspected” flag back to 0 and then increases the heartbeat.
- If the lock owner has failed to update the “Suspected” flag and heartbeat, the process that is waiting will periodically recheck the “Suspected” flag. After 10 minutes (the default value specified in the deadnodelocktimeout parameter), the lock is forcibly removed by deleting the lock record from the database table.
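The heartbeat rules above can be sketched as two small routines, one for the lock owner and one for a waiting process. This is a simplified model under stated assumptions: the lock record is an in-memory object instead of a database row, time is passed in explicitly as seconds, and the class and function names are hypothetical. In particular, the way the waiter decides that the heartbeat "has not been updated" (by comparing it with the value seen on its previous check) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

DEAD_NODE_LOCK_TIMEOUT = 10 * 60  # seconds; default of deadnodelocktimeout


@dataclass
class LockRecord:
    heartbeat: int = 0
    suspected: int = 0


def owner_tick(record: LockRecord) -> None:
    """Run by a live lock owner every 60 seconds."""
    if record.suspected == 1:
        record.suspected = 0  # tell waiters the owner is still alive
    record.heartbeat += 1


class Waiter:
    """A process waiting for a lock that another node holds."""

    def __init__(self, timeout: int = DEAD_NODE_LOCK_TIMEOUT):
        self.timeout = timeout
        self.last_seen: Optional[int] = None
        self.suspected_since: Optional[float] = None

    def check(self, record: LockRecord, now: float) -> bool:
        """Return True when the lock record should be forcibly deleted."""
        if self.last_seen is None or record.heartbeat != self.last_seen:
            # First look, or the owner made progress: start over.
            self.last_seen = record.heartbeat
            self.suspected_since = None
            return False
        if self.suspected_since is None:
            # Heartbeat unchanged since the previous check: raise suspicion.
            record.suspected = 1
            self.suspected_since = now
            return False
        # Still no sign of life: remove the lock once the timeout has elapsed.
        return now - self.suspected_since >= self.timeout
```

A live owner clears the flag and advances the heartbeat on its next tick, which resets the waiter's timer. Only an owner that stays silent for the full timeout loses its lock.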
New Parameter
This new locking mechanism implements the deadnodelocktimeout parameter. This parameter specifies the amount of time that must elapse before a process forcibly removes a lock from the Lock or LockShared table. By default, this parameter is set to 10, which indicates that 10 minutes must elapse before a record is forcibly removed. 10 minutes is also the minimum value for this parameter.
This parameter is specified in the sm.ini file. Changing the value of this parameter does not require a restart of the server.
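As an illustration, an entry that raises the timeout to 15 minutes might look like the following. This assumes the `name:value` line format commonly used for sm.ini parameters; the value 15 is only an example.

```ini
# sm.ini (illustrative entry; the value is in minutes, minimum 10)
deadnodelocktimeout:15
```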