When to Use Federation or Population

In many Configuration Management Systems, configuration data is scattered over multiple Management Data Repositories (MDRs) around the organization. Each MDR is part of a management application provided by an IT system vendor. MDRs may include very rich configuration data for the systems they manage. In the most optimistic scenario, a UCMDB instance does not copy ("populate") any of the data stored in the MDRs; instead, UCMDB gains access to all CI data via federation.

In some cases, a pure federated approach might be possible, but in other cases, this scenario becomes impractical. The following issues may contribute to the poor performance of a distributed Configuration Management System:

  • It is not clear whether all relevant MDRs are available at the time of TQL query execution, particularly if some of the MDRs lack High Availability capability. If an MDR that is necessary for the execution of a query is not available, UCMDB fails to return results for the query because the completeness and correctness of the query result are unknown.
  • The response time of the execution of such a federated TQL query may depend on slow MDRs and WAN interfaces.
  • When a group of UCMDB users accesses federated data from MDRs scattered over multiple distant geographical locations, the resulting traffic can overload the network (WAN) of a distributed organization. This is especially true for large groups of users such as a group of Service Manager (SM) Incident Manager users.
  • Large organizations often have a large group of CMS or SM users. Their MDRs may become overloaded by UCMDB since they are usually built to serve a limited set of administrative users. Some of the MDRs may be vital for the organization. For example, if an LDAP server is used as a federated data source for personal data, there is a danger that the server will be overloaded by federated queries. Applications using the LDAP server for authentication requests could be delayed or completely blocked.
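
The all-or-nothing behavior described in the first issue above can be illustrated with a minimal sketch. This is not UCMDB code: the names Mdr, MdrUnavailableError, and run_federated_query are hypothetical and exist only to show why a single unavailable MDR prevents the whole federated query from returning results.

```python
from dataclasses import dataclass, field


class MdrUnavailableError(Exception):
    """Raised when a required MDR cannot be reached (hypothetical)."""


@dataclass
class Mdr:
    """Hypothetical stand-in for a Management Data Repository."""
    name: str
    available: bool = True
    records: list = field(default_factory=list)

    def fetch(self):
        # An unavailable MDR cannot contribute its part of the result.
        if not self.available:
            raise MdrUnavailableError(self.name)
        return list(self.records)


def run_federated_query(mdrs):
    """Return the combined records only if every required MDR responds."""
    results = []
    for mdr in mdrs:
        # No partial result is returned: if any fetch raises, the error
        # propagates, because the completeness and correctness of a
        # partial answer would be unknown.
        results.extend(mdr.fetch())
    return results
```

In this sketch, a query over two reachable MDRs returns their combined records, while the same query fails as soon as one of them is marked unavailable, even though the other could still answer.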

Federation vs Population Factors

It is often necessary to populate some portion of the MDRs' data into UCMDB instances. The remainder of the data can be federated. The proportion of populated versus federated data depends on many factors, such as:

  • Dynamic nature of the data. If the data in the MDR is updated frequently and the changes need to be visible to the user, then federation is preferred. Examples include the failover of a Cluster Resource Group (CRG) or the vMotion of a virtual machine (VM). In these cases, the dynamic component is the relationship that connects the VM or CRG to the physical node or cluster node, and only this data should be federated. It is not necessary to federate all the detailed data about the node, VM, or CRG, only the reconciliation data for these entities and the relationship that connects them.
  • Importance of history. If you need to store history of dynamic data, then federation is not the correct approach as history is not captured for federated data. If the MDR can publish change events, consider migrating these change events into UCMDB via an integration infrastructure such as your organization's Enterprise Service Bus (ESB). The ESB integration component could use the UCMDB Java SDK to push the change events into UCMDB.
  • Size of the user group that consumes the data. If the user group that consumes the data via federated queries is large, the MDRs may be overloaded with too many queries. In this case, population of a higher percentage of the data is preferred.
  • Scope of federated queries and MDR response time. If a federated TQL query needs access to many data sources to execute, then the response time of the query may be long. Population is preferred, especially when the MDR response time is slow.
  • High Availability of MDRs. If the MDRs are not highly available but the data they store is required for processes with HA requirements, then population is preferred because UCMDB is highly available. If the data is federated and one MDR needed for the execution of a TQL query is not available, UCMDB fails to return the query results even if the other MDRs necessary for the query execution are available.
  • Network bandwidth. If frequent access to remote MDRs involves access via an expensive WAN, then the percentage of populated data from such MDRs should be increased.
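
The factors above can be read as a checklist. The following is a rough sketch only, not a UCMDB feature: DataProfile and recommend are hypothetical names, and the rule that any population factor outweighs the dynamic nature of the data is a simplifying assumption made for illustration. In practice the result is a balance between populated and federated data rather than a single choice.

```python
from dataclasses import dataclass


@dataclass
class DataProfile:
    """Hypothetical description of one slice of MDR data."""
    changes_frequently: bool = False
    history_required: bool = False
    large_user_group: bool = False
    slow_or_many_sources: bool = False
    mdr_highly_available: bool = True
    consumers_need_ha: bool = False
    expensive_wan: bool = False


def recommend(profile):
    """Return ("populate" | "federate" | "either", list of reasons)."""
    reasons = []
    if profile.history_required:
        reasons.append("history is not captured for federated data")
    if profile.large_user_group:
        reasons.append("many consumers could overload the MDR")
    if profile.slow_or_many_sources:
        reasons.append("federated query response time may be long")
    if profile.consumers_need_ha and not profile.mdr_highly_available:
        reasons.append("MDR is not highly available")
    if profile.expensive_wan:
        reasons.append("frequent remote access uses an expensive WAN")

    # Assumption: any factor favoring population takes precedence.
    if reasons:
        return "populate", reasons
    if profile.changes_frequently:
        return "federate", ["data changes often and must be current"]
    return "either", []
```

For example, a frequently changing VM-to-host relationship with no other constraints yields "federate", while the same data with a history requirement yields "populate".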