A Tiered Structure for Governing and Managing Large-Scale Directories

David Guest
Consultant
Novell, Inc.
dguest@novell.com

01 Jul 2003

This AppNote explores the issues involved in the management and governance of a large-scale Meta-Directory implementation. It outlines a tiered directory system structure that allows for both corporate-wide and local supportability for applications.

Since this AppNote was written in the U.K., it retains the British spelling used by the author.

Topics	Meta-Directory design, directory management, database administration
Products	none
Audience	network consultants
Level	intermediate
Prerequisite Skills	familiarity with distributed directory concepts
Operating System	n/a
Tools	none
Sample Code	no

Introduction

The recent growth in directories has seen the "Meta-Directory" take on a new, highly visible role in the management of identity. This role can be seen either as a central location for storing common data shared between multiple disparate systems, or as a location to hold meta-tags that indicate where the relevant data attributes about an identity are stored. Each of these functions has a specific place in the corporate environment, and each has strengths and weaknesses that should be assessed before deciding which is best for a specific environment.

The core directory holding meta-tags can be a relatively small system; certainly a tag should be smaller than the data it represents. However, any application or system that makes a request to this directory must be able to understand the meta-data returned and make an onward request to the location it identifies. Alternatively, the directory system must be able to make the onward request on behalf of the application and translate the data returned into a form that the application can understand. In this case, the directory becomes a reference component within a full application (or system) architecture. The growth of XML (Extensible Markup Language) as a transport medium may well ease the implementation of this form of technology in the future.

The data form of Meta-Directory is larger in scale, holding all of the relevant data internally. This simplifies the application architecture: the application only has to look for data in one location and understand the information returned from it.
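The difference between the two forms can be sketched in a few lines of Python. This is only an illustration: the user ID, the attribute name, the `telephony` system, and the sample number are all made up, and plain dictionaries stand in for real directories.

```python
# Hypothetical illustration; none of these names come from a real product.

# Data form: the Meta-Directory holds the attribute values themselves.
data_directory = {
    "jsmith": {"telephoneNumber": "+44 20 7946 0000"},
}

# Meta-tag form: the Meta-Directory holds only a pointer to the system
# that stores each attribute.
tag_directory = {
    "jsmith": {"telephoneNumber": "telephony"},
}

# Stand-ins for the systems that the tags point to.
backing_systems = {
    "telephony": {"jsmith": {"telephoneNumber": "+44 20 7946 0000"}},
}


def lookup_data_form(user, attribute):
    """One lookup: the value is stored in the Meta-Directory."""
    return data_directory[user][attribute]


def lookup_tag_form(user, attribute):
    """Two lookups: read the tag, then make the onward request."""
    system = tag_directory[user][attribute]
    return backing_systems[system][user][attribute]
```

In the meta-tag form, the caller (or the directory acting on its behalf) has to perform the second lookup; in the data form, one lookup is enough.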

In both cases, data (whether meta-tags or real data) must be added to the Meta-Directory and managed throughout its lifecycle. This management must not be limited to the Meta-Directory, but must touch on each of the other system directories (or databases) that hold identity information.

This management and governance is key to the ongoing success of a Meta-Directory implementation. This AppNote examines the management and governance of a data storage Meta-Directory; however, many of the principles discussed could be applied to meta-data directory systems as well.

Tiered Directory System Operation

Logically, a Meta-Directory is at the centre of a directory system. When viewed in a concentric model, as shown in Figure 1, the interfaces between the Meta-Directory and User Directories become obvious.

Figure 1: A simple concentric model of a directory system.

In this model, data flows into and out from the Meta-Directory to the User Directories. These could be directory technologies or user databases accessed for authentication by Application Systems.

The concentric model is relatively simple and does not include any details on data ownership. For example, if user data is "owned" by an HR system and flows into the Meta-Directory and is then passed onwards to other User Directories, the concentric circle model no longer applies.

For this reason, any Meta-Directory design exercise must include a review of the existing data locations and preferred information flows. This will produce a data model with several tiers of operation, as shown in Figure 2.

Figure 2: A tiered model of a directory system.

At the centre, Tier 1, is the Meta-Directory, along with the connectors to User Directories and other primary directories, such as HR Systems and Telephony Systems. These Tier 2 directories either feed data into the Meta-Directory or are fed data from the Meta-Directory. In some cases, they do both.

As Figure 3 shows, application-based directories are expected to feed data into and out of more specialised directory systems that sit closer to individual applications.

Figure 3: Data flow between the Meta-Directory and application directories.

The governance of this tiered system can be delegated to departments or regions to some extent. At the centre, a fully corporate-wide governance structure should exist. This will have a direct role in the management of the Meta-Directory, and an indirect role in the governance of the User Directories. It will allow local management of a directory used within a specific department or location. This local control can cover schema management, but must not be allowed to modify any data held in the Meta-Directory, even data that is held locally for specific application use.

Within this tiered model, each directory contained within a department, or held for a specific application system, is seen as being within a local governance domain. This allows for each of these systems to be managed independently.

Infrastructure

Directories within a corporation must be split into specific areas: central and intermediate. The difference between these classifications can be seen in the function provided by the systems used within each classification.

Central systems are those that are used throughout the corporate structure or by a large percentage of the corporate population.

Intermediate systems are those required to complete specific functionality within a department, or any subset of the corporate population that requires access to a specific system function with its own administration (see Figure 4).

Figure 4: Detail of intermediate system infrastructure.

This intermediate system infrastructure allows for different governance, management, and administration to apply to directory systems dependent on their usage and the corporate availability of the system. Department-specific applications will continue to be managed within the specific unit where they are used.

As an example, an Oracle-based application would normally require dedicated Oracle Database Administrators (DBAs) to administer the system. These DBAs are found within the existing support structure and have an in-depth knowledge of the system. As such, they should be retained for support of the application within the existing environment.

Ownership in a Multi-Tiered System

In a configuration where a direct connection is to be made between the locally-managed directory and the core Meta-Directory, a number of governance issues are raised. Because the link is integral to the function of the system, changes to the locally-managed directory may affect the data flow into and out of the Meta-Directory. Changes in the actual data required (for example, a change to the format of an attribute) will require a change to be made to the master data held within the Meta-Directory. For this reason, the governance model dictates that governance of this system should be held at the corporate-wide level.

Figure 5 shows the potential ownership and governance of the Tier 1 and Tier 2 directories. Here, both Tier 1 and Tier 2 are owned and governed centrally. Each object (in this example, a User object) is formed of specific classes. These are common to all the Tier 1 and Tier 2 directories, and data flows correctly between them.

Figure 5: Potential ownership and governance of Tier 1 and Tier 2 directories.

Within the Tier 2 directory, an additional class can be seen: Functional OrgPerson. This class contains specific attributes that are used by local or functional applications. Because this directory interfaces directly to the Meta-Directory, attributes in this directory must be governed by the central governance body. This then fulfils two functions:

It ensures that each attribute is uniquely named and referenced.
It allows the central governance organisation to take an overall view of attributes that are being used across the company.

Should multiple directories request the same attribute to be stored locally, it can be added to the Meta-Directory. This ensures that data flow is consistent across the entire corporation. Where the data is seen as the same and is required across the organisation, it is available and replicated throughout the organisation.
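The overall view described above could be kept as a simple register of which directories use which local attributes. The sketch below is a hypothetical illustration only: the class, the attribute and directory names, and the threshold of two directories (one reading of "multiple") are all assumptions, not part of any product.

```python
class AttributeRegistry:
    """Central record of which Tier 2 directories use which local attributes."""

    def __init__(self):
        # attribute name -> set of directory names that hold it locally
        self._users = {}

    def register(self, attribute, directory):
        """Record that `directory` holds `attribute` locally.

        Keying on the attribute name gives each name a single entry,
        so the same name is never registered as two different attributes.
        """
        self._users.setdefault(attribute, set()).add(directory)

    def promotion_candidates(self, threshold=2):
        """Attributes held by at least `threshold` directories.

        These are the attributes the central body might consider
        adding to the Meta-Directory (threshold is an assumption).
        """
        return sorted(
            name for name, dirs in self._users.items() if len(dirs) >= threshold
        )
```

For example, if two directories both register a hypothetical `costCentre` attribute and only one registers `salesRegion`, `promotion_candidates()` lists `costCentre` alone.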

Figure 6 shows the interaction between lower-level directories and the intermediate Tier 2 directories.

Figure 6: Interaction between intermediate Tier 2 and lower-level directories.

Here there is no requirement to involve the central governance in attribute changes to the lower-level directory. However, any change that directly affects the Tier 2 directory must be advised to, and agreed by, the central governance organisation.

Data Flow

In this model, a defined method of data flow must be specified and adhered to. Data within the directory model should flow between tiers (from a directory to its parent or child), never across a tier between directories at the same level. This ensures that communications are linear. Circular data flows should be avoided, as these cause support issues where a data source cannot easily be traced. This does not mean that data cannot flow bidirectionally between tiers; rather, it forces a structured approach with parent/child relationships between tiers.
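One way to picture the linear rule is as a check on each proposed link between two directories. The following is a minimal sketch under stated assumptions: the `Directory` type, the `parent` field, and the directory names are hypothetical, and the rule is interpreted as "a link is allowed only between a directory and its parent in the tier directly above".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Directory:
    name: str
    tier: int
    parent: Optional[str] = None  # name of the directory one tier up, if any


def is_linear_link(a: Directory, b: Directory) -> bool:
    """Return True if data may flow between a and b, in either direction.

    A link is accepted only when one directory is the parent of the
    other and the two sit in adjacent tiers.
    """
    a_is_child = a.parent == b.name and a.tier == b.tier + 1
    b_is_child = b.parent == a.name and b.tier == a.tier + 1
    return a_is_child or b_is_child


# Hypothetical layout loosely following the tiers described in the text.
meta = Directory("meta", 1)
hr = Directory("hr", 2, parent="meta")
sales_app = Directory("sales_app", 2, parent="meta")
s_admin = Directory("s_admin", 3, parent="sales_app")
```

With this layout, a link between `meta` and `hr` passes the check in both directions, while a link between `hr` and `sales_app` (same tier) or between `meta` and `s_admin` (no direct parent/child relationship) does not.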

With this configuration, some Tier 2 directories could be seen as pseudo-distribution directories to lower-level Tier 3 systems. For this reason, they are described as supplementary distribution directories. Common data for the lower-level systems is held within the Tier 2 system. This data may not be sourced, or even held, in the Tier 1 directory. An organisational agreement must exist between the corporate governance of the Meta-Directory and the owners of the lower-level directories, requiring that any data flowing downward through the directories is always advised to the central governance. This will ensure adherence to the linear model for data flow.

This model then allows for the ongoing definition of the Tier 1 directory. The core system can be modified to include additional attributes as they are required by multiple lower-level systems. The definition and acceptance of these attributes is controlled by the central governance organisation and will be covered in the "Schema Governance" section later in this AppNote.

Administration

One of the advantages of maintaining support within each of the tiers is the in-depth knowledge available within the support structure for the functional system. This allows the ongoing administration to use existing tools and maintenance programs without affecting the operation of the Meta-Directory.

Within each locally-governed directory, original tools can be used to provide administration. Where required, manual administration operations can be maintained. This will provide additional levels of support that, at least in the early days of the Meta-Directory, cannot be provided automatically.

Data modified manually should not include any of the attributes delivered from Tier 1. However, if these are modified, the Meta-Directory structure and rules will force these attributes back to their master data content.

Master Data Locations

Within the Meta-Directory infrastructure, each data attribute must be defined within the directory schema. This will also detail the master data source for each attribute. Attributes within the Meta-Directory and its subordinates must have a single master source. This will allow data to be modified in its master location and then flow through the system. Any directory system with the same attribute defined will then show exactly the same content. Any data that is modified in a location other than the master source will be overwritten with the master data. This will ensure that data integrity is maintained at a high level.
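The overwrite rule can be sketched as a small reconciliation step. This is an illustration under assumptions, not a description of how any particular Meta-Directory product works: the function name, the dictionary layout, and the system and attribute names are all hypothetical.

```python
def reconcile(systems, master_source):
    """Return a copy of `systems` with every attribute reset to its master value.

    systems:       system name -> {attribute name: value}
    master_source: attribute name -> name of the system that masters it
    """
    # Work on copies so that the input is left untouched.
    result = {name: dict(attrs) for name, attrs in systems.items()}
    for attribute, master in master_source.items():
        master_value = systems[master][attribute]
        for attrs in result.values():
            # Only systems that already define the attribute receive it.
            if attribute in attrs:
                attrs[attribute] = master_value
    return result
```

If a hypothetical `hr` system masters `surname`, a value changed by hand in another system is replaced by the `hr` value on the next reconciliation, while attributes with no entry in `master_source` are left alone.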

In general, most of the administration of master data will be done by the data owner. In the case of users, the data will be considered to be owned by each individual user. The users will have access to the master source for their address, mobile telephone number, and other personal details. However, they will not be able to change their e-mail addresses, as this data is owned by the e-mail system and is generated automatically as user accounts are created.

Various sources for data will be available across the system, ranging from HR data held within the HR department and telephony data held within the telephone exchange system, to application data held within each application. As data is modified in each of these locations, it flows into Tier 1, potentially being transformed, and then out to other subordinate locations.

Schema Governance

The complete Meta-Directory must be actively governed. However, the major portion of the governance will revolve around the schema of the Tier 1 directory. Here all the common attributes will be configured and the naming conventions and internal OID numbers allocated. These conventions will then be applied to all Tier 2 and lower directories where the same attributes are to be held.

As new attributes are identified for the lower directories, either because of the addition of a new application or an enhancement to an existing system, decisions will be made as to the requirement for data to flow into and out from the new attributes. If this data becomes apparent within the Tier 2 systems, its overall acceptance can be examined by the central governance body. Naming conventions and OID numbers, or other data definitions, will then be applied.

This structure ensures that only data which is relevant to corporate-wide systems is held centrally. As new attributes become widely accepted and can move from being functionally dependent to corporate-wide, they can be added to the Meta-Directory. Upon being added, the master data source is also identified and processes are put in place to guarantee the data quality.

Tiered System Governance Example

To examine the governance of the directory fully, an example of a tiered system is helpful. In this case, the Meta-Directory is in place with feeds to and from the HR systems. A department is now linking in to the system in order to provide for interoperability with its Sales systems.

The Sales systems consist of two separate applications, each with its own data sources. One, called "S Admin," is used for collating details of staff and their contacts; the other, "S Structure," is a combination of sales structures and corporate accounts. Data in these systems also includes details that can be sourced from the Meta-Directory, such as name, telephone number, title, and so on. This data can be fed directly from the Meta-Directory, which is one benefit of integrating with it.

In order for the system to function correctly, data from the "S Admin" and "S Structure" systems must be combined. At present, this is a manual process involving administration of both systems, as shown in Figure 7.

Figure 7: The current system requires manual intervention to share data.

Under the new structure, these systems would be seen as being subordinate to the Meta-Directory, as shown in Figure 8.

Figure 8: The new structure provides for automatic data flow between systems.

This structure allows data to be transformed as it flows between the systems. Each system can then dynamically update the other.

In order for this structure to function, the Meta-Directory must hold details from both "S Admin" and "S Structure." Since these attributes are not used over the whole corporate structure, this adds complexity to the Meta-Directory without providing any corporate-wide benefit.

To provide this level of integration without holding Sales-specific data in the Meta-Directory, an additional layer of data management is added. This becomes an application directory with a direct connection to the Meta-Directory at Tier 1 and to the Sales systems at Tier 3. The structure would then logically appear as shown in Figure 9.

Figure 9: The revised structure adds a layer of data management.

Here, data flow from both Sales systems is gathered in an Application Directory. This directory is synchronised with the Meta-Directory and receives all the corporate data, which is then stored for onward synchronisation with the Sales systems. Because there is now an additional level between the Sales systems and the Meta-Directory, there is no requirement for the Meta-Directory to hold Sales system-specific data. The data can now be held in the supplementary distribution directory without having any effect on the Meta-Directory structure.
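The role of the Application Directory in this example can be sketched as a merge of three sources for one user. This is a hypothetical illustration: the function names and the attribute names (including `contacts` and `salesTeam`) are assumptions, and the set of corporate attributes simply mirrors the name, telephone number, and title mentioned earlier.

```python
# Attributes mastered in the Meta-Directory (attribute names are assumptions).
CORPORATE_ATTRIBUTES = {"name", "telephoneNumber", "title"}


def build_application_entry(meta_entry, s_admin_entry, s_structure_entry):
    """Combine one user's data from the three sources into a single entry."""
    entry = {}
    entry.update(s_admin_entry)
    entry.update(s_structure_entry)
    # Corporate attributes always come from the Meta-Directory, so any
    # local copy of them is overwritten here.
    for attribute in CORPORATE_ATTRIBUTES:
        if attribute in meta_entry:
            entry[attribute] = meta_entry[attribute]
    return entry


def sales_specific(entry):
    """Attributes that stay in the Application Directory only."""
    return {k: v for k, v in entry.items() if k not in CORPORATE_ATTRIBUTES}
```

The entry returned by `build_application_entry` is what the two Sales systems would synchronise with, while the result of `sales_specific` shows the data that never needs to appear in the Meta-Directory.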

In this case, the new Application Directory is owned and governed centrally. This is mandated by the direct link between it and the Meta-Directory, allowing the corporate governance to have visibility of the attributes shared between the Sales systems. Should any of these attributes be seen as benefiting the corporate structure, they can be added to the Meta-Directory with the master data source set as the Sales system.

Conclusion

This AppNote has outlined a tiered directory system structure that allows a complex corporation to manage a Meta-Directory system while retaining local supportability for applications and maintaining a persistent view of the identity of each user throughout the organisation.

* Originally published in Novell AppNotes
