An Introduction to Novell's DirXML
Technical Editor
DeveloperNet University
mmckell@novell.com
01 Jul 2000
This AppNote offers an introduction to Novell's DirXML technology, starting with brief descriptions of XML and NDS eDirectory and how they fit together. It then provides details on the DirXML architecture and offers a few examples of how this technology will improve the way you do business.
- Introduction
- XML and the Directory
- DirXML Architecture
- DirXML Components
- DirXML Rules and Transformations
- Association Examples
- Summary
Introduction
In any organization, various databases and applications are used to store diverse, company-related information. These diverse databases are likely disconnected as a corporate whole and isolated within singular departments with divided ownership. This means that within the same organization there are likely multiple databases that contain much of the same information, arranged within different data definitions. Each time a new application or database is added to the enterprise, so is a new set of data definitions.
The main problem created by adding these new applications and databases is that each application uses the same, redundant information. For example, an existing e-mail system contains information about employees who use the system. When a new payroll system is added to the enterprise, these same employees are added to the new system with potentially redundant information. When that redundant information changes, each database that uses it also needs to be changed, resulting in expensive maintenance costs and greater potential for error.
DirXML, based on Novell Directory Services (NDS) eDirectory software and industry standard eXtensible Markup Language (XML), is a technology that enables enterprise management of data, with the goal of dramatically reducing the costs of managing different databases and directories. DirXML is also a key component of DENIM, Novell's Directory Enabled Net Infrastructure Model, because it not only integrates with the directory and the various platforms on which it natively runs, but it also leverages the powerful capabilities of the directory.
This AppNote offers a brief explanation of the XML and NDS eDirectory technologies as they relate to DirXML, describes its architecture, and provides several examples of how it might be used in today's computing environment. For more information about DirXML, visit:
http://www.novell.com/nds/dirxml/
XML and the Directory
Recently, technologies that have been around for many years have been adapted to meet the demands of the Internet. SGML (Standard Generalized Markup Language), which has been around for 20 years, is the parent of both HTML and XML, and we can all readily recognize the value that HTML has given the Internet. Directory technologies, which have traditionally offered improved network management and security, are moving outside firewalls to offer the same capabilities for Internet use as well. XML and NDS eDirectory are excellent technologies that offer, when combined, tremendous benefits.
XML
XML is a technology, or rather a technology specification, that is used to structurally describe data. Like its markup brother, HTML, XML data is contained within tags. Unlike HTML, however, XML has no predefined tags; you must define your own. Because of its structural description of data, XML allows that data to pass easily between different applications. Some applications have a proprietary data format; others use open standards. XML can act as a bridge between all these different data formats to ensure that data is understood across the entire enterprise.
XML is extremely flexible and allows for the data to represent structure and meaning without specifying how the data needs to be rendered. As an example, consider using XML to describe a person. To do this, we'll create a tagged data element called "person", and then within that element describe sub-elements which enrich the description of the person. Figure 1 shows the set of defined attributes for a sample person.
Figure 1: An example of using XML to describe a set of data.
<person>
  <name>Lars Blattman</name>
  <sex>male</sex>
  <age>33</age>
  <status>married, one child</status>
  <employment>environmental conservationist</employment>
  <hobbies>curling and les boules</hobbies>
  <height>6' 0"</height>
  <weight>175 lbs.</weight>
</person>
Notice the similarities between XML and other markup languages. Although tags are used to describe the data elements, the tags are arbitrary. Any label could be used. The power of XML is its ability to define tags and then use the tags to organize, define, and structure the data. This flexible method of organizing data makes XML technologies the perfect integration technology, because it can inherently conform to any environment and any data model. XML is also easy to read and understand and completely non-proprietary.
XML's non-proprietary nature facilitates many things, especially being able to share its data across various platforms and applications. Applications can share XML data even if the rendering or interpretation of the data is different. XML has the flexibility to be extended, meaning that one application can take a set of XML data and extend it with additional attributes. Because of its flexibility, XML is becoming the lingua franca of the Internet that allows all devices, platforms, and applications to share data.
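As a minimal sketch of this idea, the following Python fragment parses a shortened version of the person record from Figure 1 with the standard library's xml.etree.ElementTree module, then extends the record with one more element, as a second application might. The "email" element and its value are assumptions invented for this illustration; they are not part of the original figure.

```python
import xml.etree.ElementTree as ET

PERSON_XML = """
<person>
  <name>Lars Blattman</name>
  <sex>male</sex>
  <age>33</age>
  <hobbies>curling and les boules</hobbies>
</person>
"""

# Parse the document; the tag names carry the meaning, not the layout.
person = ET.fromstring(PERSON_XML)
name = person.findtext("name")
age = int(person.findtext("age"))

# A second application can extend the same record with its own element.
# "email" is a hypothetical addition used only for this illustration.
email = ET.SubElement(person, "email")
email.text = "lars@example.com"

tags = [child.tag for child in person]
print(name, age, tags)
```

The application that added the extra element and the one that only reads "name" and "age" can keep sharing the same document, because neither depends on how the other renders or interprets it.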
Directory Services
Directories provide storage, management, security, and distribution of data. Within a directory, data is stored in the form of objects that are managed in a hierarchical namespace. Objects can include individual users, groups of users, printers, network drives, and many others. Directories are proficient at storing this information and then making it available to consumers: applications and users.
One of the characteristics of an efficient directory is the ability to distribute the data and to replicate or provide copies of the data to multiple locations. This provides a means for applications to access data local to their environment without having to cross WAN links, which are expensive and cumbersome. This replication is effective because it provides a single system image of all network resources. Using objects, directories can define all the elements of the network and how these objects relate to each other. Since the directory has a view of the entire network, it is also very good at controlling access to the elements within the network. By enforcing authentication and then imposing access controls on the authenticated identity, directories can control not only the management of the data but also who has access to it.
For a directory service to be successful in today's computing environment, it must possess several key characteristics:
Cross-Platform. Networks are heterogeneous. Different operating systems and hardware platforms offer different performance characteristics and host various applications, and a directory must be able to work across all relevant platforms.
Highly Scalable. Since the network contains millions, if not billions, of entities or objects, the directory service must be capable of storing information about each entity without limits.
Open Standards-Based. A directory service must support industry standards so applications can easily access the data. The most important of these is the LDAP (Lightweight Directory Access Protocol) standard. This is the access method that Internet applications use to access directory information.
Reliable. Because the directory stores important authentication and access control information, it must be completely reliable. The distributed and redundant nature of NDS eDirectory makes it fault tolerant and accessible, even if part of the network becomes inaccessible.
Secure. The strongest security must be enforced by the directory so that data stored within it is protected from unauthorized use or access.
DirXML
Together, XML and NDS eDirectory form DirXML. So where does this combination product, DirXML, fit relative to Novell's other products?
DirXML is a directory-enabled application that sits on top of NDS eDirectory. DirXML is peer technology to solutions like NDS Corporate Edition and Novell eGuide, all of which leverage the foundational technology of NDS eDirectory. NDS eDirectory provides the means for storing and distributing directory data. NDS eDirectory also provides a notification when a change to the stored data occurs. DirXML provides a means for interfacing with eDirectory data and surfacing the data and change events through an XML interface. This essentially provides a means for exposing the valuable directory data to other applications, using XML as the vehicle. DirXML is built to extend the functionality of the directory to the application in a non-intrusive way.
NDS integrates many systems, directories, and applications. NDS supports all major access protocols and security standards. Developers can use LDAP, ADSI, or Java APIs to access NDS eDirectory content. Hundreds of applications now use NDS as the key point of management, meaning that NDS controls the policies that drive application functionality.
Novell has taken a unique approach to accomplish this integration. Rather than require application developers to retrofit their applications, to provide an LDAP stack, or to write proprietary scripts, Novell has decided to take the directory to the application with DirXML. The application needs no modification to integrate with NDS. Developers can create integrated solutions using open standards like XML and LDAP without even modifying their application code. This solution makes it easy for both users and developers to take advantage of the powerful NDS eDirectory features.
DirXML, in conjunction with NDS eDirectory, accomplishes the following important tasks:
DirXML uses NDS replication events to surface changes.
Data management can be centralized or distributed and DirXML addresses the challenge of pulling all the data together.
Directory data is exposed in XML format allowing it to be consumed and shared by XML applications or applications integrated through DirXML.
The flow of data is controlled by specific filters that govern data elements defined in the system. Using filters, authoritative data sources can be enforced.
Directory data in XML format can have rules applied to it. These rules govern the interpretation and transformation of the data as changes flow through the DirXML engine.
The data can be transformed from XML into virtually any data format. This allows DirXML to share data with virtually any application.
Because data and the movement of data are governed by XML, the integration drivers are flexible enough to accommodate any need.
Associations between NDS objects and all other integrated systems are carefully maintained to ensure that data changes are always reflected across all systems.
DirXML Architecture
The goal of the DirXML technology is to provide clean movement of data between NDS eDirectory and any application, directory, or database that requires directory information. To accomplish this, DirXML has a well-defined interface that takes NDS data and events and translates them into XML format. This interface allows the data to flow in and out of NDS in a bidirectional manner (see Figure 2).
Figure 2: DirXML acts as an engine to pass directory information from NDS to subscribing applications.
The data that flows between NDS and the target application is managed and processed by Subscriber and Publisher channels. These are the means of linking event systems (or changes in the data) of both NDS eDirectory and the target application together so that data flow is determined by its dynamic characteristics. The Subscriber and Publisher channels act as filters for which events are received from NDS and which are published to NDS. These filters also determine which object classes and attributes are accepted into the channels, and they are actually configured through NDS, giving you greater control of how the data flows throughout the enterprise.
The Subscriber and Publisher channels communicate with the target application through an Application Shim. An Application Shim is an application-specific piece of code that knows how to communicate with the target application. The shim passes data to the application once the data has been processed by DirXML, and it receives data from the application that is translated before it goes to NDS eDirectory.
The DirXML engine is a collection of interfaces and data manipulation technologies that allow even disparate data systems to connect and share data. The DirXML engine provides an interface to NDS that exposes NDS data and NDS events using an XML format. The DirXML engine employs a rules processor and a data transformation engine to manipulate the data as it flows between two systems (see Figure 3). It also provides an interface for application-specific programs to attach through the DirXML engine to capture the data streams that flow from NDS.
Figure 3: The rules processor portion of the DirXML engine controls the flow of data between NDS eDirectory and other applications.
The NDS eDirectory Piece
NDS eDirectory is a directory service that manages data between multiple servers. When changes occur to the data, all other servers which hold copies of that data are informed of the change and make corresponding updates.
When DirXML is loaded on top of NDS eDirectory, these events are passed between NDS and the DirXML engine. NDS manages the drivers that are loaded and configured to consume these events. NDS also provides some basic management services related to the events themselves so that events are always stored until they've been successfully consumed.
The DirXML engine converts the NDS events into XML documents (in DOM format). Rules are applied to the XML document and data transformations are performed to convert the data into the target application's native data format. Once the XML document has been completely processed, the event data is handed to an Application Shim which delivers the formatted data to the target application. We'll explore each of these steps in detail:
DirXML is represented by objects defined in NDS. The base NDS schema has been extended to accommodate this information. The new object types are:
Driver Set. An object that defines a collection of drivers.
Driver. An object that defines all the components of a driver including rules, style sheets, and the Application Shim.
Rule. An XML document that defines a rule that will be applied to the NDS event stream as it flows through the DirXML engine. Examples of these rules are Event Rule, Schema Mapping Rule, Matching Rule, Create Rule, and Placement Rule. These will be discussed in more detail below.
Style Sheet. An XSLT document that defines a transformation of the NDS XML code into some other data format.
Application Shim. Although not specifically represented as an object in NDS, the Application Shim is worth noting as it completes the driver definition. The Application Shim is an executable piece of code that the DirXML engine uses to interface with the target application. This is actually stored as an attribute of the driver objects stored in NDS.
In general, the DirXML engine controls the flow of data between NDS and the target application. NDS provides the engine for storing the configuration and the event system that drives the entire process. The DirXML technology includes all the components necessary to move data between NDS and any given application.
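The flow described in this section (event, XML document, rules, Application Shim) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element and attribute names ("attr", "class", "dn"), the rename_attrs rule, and the ListShim class are all hypothetical stand-ins, not the actual DirXML XML vocabulary or driver interfaces.

```python
from xml.dom.minidom import Document

def event_to_dom(event):
    """Build a DOM document for a directory event (hypothetical vocabulary)."""
    doc = Document()
    root = doc.createElement(event["type"])      # e.g. "modify"
    root.setAttribute("class", event["class"])
    root.setAttribute("dn", event["dn"])
    for attr_name, value in event["attrs"].items():
        node = doc.createElement("attr")
        node.setAttribute("name", attr_name)
        node.appendChild(doc.createTextNode(value))
        root.appendChild(node)
    doc.appendChild(root)
    return doc

def rename_attrs(doc, mapping):
    """Stand-in for a rule: rename attributes in place and return the document."""
    for node in doc.getElementsByTagName("attr"):
        name = node.getAttribute("name")
        node.setAttribute("name", mapping.get(name, name))
    return doc

class ListShim:
    """Stand-in for an Application Shim: records what it would deliver."""
    def __init__(self):
        self.delivered = []

    def deliver(self, doc):
        self.delivered.append(doc.documentElement.toxml())

def process(event, rules, shim):
    """Convert the event to XML, apply each rule in order, hand off to the shim."""
    doc = event_to_dom(event)
    for rule in rules:
        doc = rule(doc)
    shim.deliver(doc)

shim = ListShim()
process(
    {"type": "modify", "class": "User", "dn": "cn=lars,o=acme",
     "attrs": {"Telephone Number": "555-0100"}},
    [lambda d: rename_attrs(d, {"Telephone Number": "phone"})],
    shim,
)
print(shim.delivered[0])
```

Keeping the rules as an ordered list of functions over one document mirrors the idea that the configuration, not the engine code, decides what happens to each event.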
DirXML Components
To begin a description of the DirXML components, let's assume NDS eDirectory is running within a network and that an application storing user data is also running on that network, but that it maintains its own data, separate from NDS eDirectory. You want to connect these together so your organization only has one set of information to maintain. You do this by placing the DirXML engine between the two systems.
Publish and Subscribe Channels
Once the DirXML engine is in place, the Publisher and Subscriber channels (the way NDS and the connected application share data) can be defined. These channels use filters to define which events flow through the DirXML engine in either direction. The filters themselves are configured through NDS which means that strict administrative control--based on directory policies--is enforced when determining how the data flows.
When the DirXML driver is loaded, the driver subscribes to NDS events. When events occur, the data is handed to the driver for processing. Likewise, the driver publishes events back to NDS as data changes in the connected application. For both the Publisher channel and the Subscriber channel, specific filters are applied so that only events which are relevant are processed.
From a business perspective, this allows the administrator to constrain the way data flows from one system to another. Using these filters, it is possible to designate that events can be subscribed to by one system, while restricting that same system from publishing events back. This creates a one-way synchronization solution.
If no filter is defined for the channels, then by default, no data events will be exposed to either system. Such filter definition allows the administrator to designate, at a very granular level, the kind of data that is exposed. Each object class can be selected or deselected. Within each object class, the specific attributes of that class can be selected or deselected, giving administrators a very finely grained mechanism for controlling the flow of data.
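A filter of this kind can be pictured as a mapping from object class to the set of attributes allowed through. The sketch below is a simplified model, not DirXML's actual filter format; the dictionary layout and the apply_filter function are assumptions made for illustration.

```python
# Hypothetical filter: object class -> set of attributes allowed through.
subscriber_filter = {
    "User": {"Surname", "Telephone Number"},
}

def apply_filter(channel_filter, event):
    """Return the event trimmed to the allowed attributes, or None if blocked."""
    allowed = channel_filter.get(event["class"])
    if not allowed:
        return None  # class not selected: nothing is exposed
    attrs = {k: v for k, v in event["attrs"].items() if k in allowed}
    if not attrs:
        return None  # no selected attribute changed: nothing to pass on
    return {**event, "attrs": attrs}

event = {"class": "User", "attrs": {"Surname": "Blattman", "Title": "Analyst"}}
passed = apply_filter(subscriber_filter, event)
blocked = apply_filter({}, event)  # no filter defined: the default is to pass nothing
```

Here "Title" is dropped because it was never selected, and an empty filter blocks the event entirely, matching the default described above.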
Data Representation
For a specific driver, you must define rules and style sheets that accomplish the processing and transformation of the data. This ensures that on either side of the DirXML engine, the data is represented in the correct format. On the NDS side, the data is in an NDS-formatted XML. On the application side, the data is represented in the application's native format.
DirXML Event Cache
All of the events generated through NDS are stored in an event queue until they are successfully processed. This guarantees that no data will be lost due to a bad connection, loss of system resources, unavailability of a driver, or any other network failure.
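The guarantee rests on one simple idea: an event leaves the queue only after it has been processed successfully. The following Python sketch shows that idea with an in-memory queue. The EventCache class and its method names are hypothetical, and a real cache would be persistent rather than held in memory.

```python
from collections import deque

class EventCache:
    """Holds events until a handler has processed them successfully."""

    def __init__(self):
        self._pending = deque()

    def add(self, event):
        self._pending.append(event)

    def drain(self, handler):
        """Process events in order; stop at the first failure, keeping it queued."""
        processed = 0
        while self._pending:
            event = self._pending[0]      # peek, do not remove yet
            try:
                handler(event)
            except Exception:
                break                     # event stays at the head for a later retry
            self._pending.popleft()       # remove only after success
            processed += 1
        return processed

    def __len__(self):
        return len(self._pending)

cache = EventCache()
for n in range(3):
    cache.add({"id": n})

calls = {"count": 0}
delivered = []

def flaky_handler(event):
    """Simulated driver that is unavailable on its second call."""
    calls["count"] += 1
    if calls["count"] == 2:
        raise ConnectionError("driver unavailable")
    delivered.append(event["id"])

first = cache.drain(flaky_handler)    # delivers event 0, fails on event 1
second = cache.drain(flaky_handler)   # retries event 1, then delivers event 2
```

Because the failed event is never removed, the retry delivers it in its original position and ordering is preserved.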
Association Table
In DirXML, associations refer to the matching of objects in NDS with objects residing in connected systems. When DirXML is initially installed, the NDS schema is extended. Part of this extension is a new attribute tied to the user object. This attribute is an association table. Association tables keep track of all the objects that an NDS object is linked to. This table is built and maintained programmatically so there is never a reason to edit this information manually, although it is often helpful to view this information.
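Conceptually, an association table answers two questions: which object in a connected system corresponds to this NDS object, and which NDS object corresponds to this foreign object? The sketch below models both lookups in a single in-memory structure. That is a simplification for illustration, since the text describes the table as an attribute on each NDS object; the class, the system names, and the keys are all hypothetical.

```python
from collections import defaultdict

class AssociationTable:
    """Toy model of associations between NDS objects and connected systems."""

    def __init__(self):
        self._by_object = defaultdict(dict)   # NDS DN -> {system: foreign key}
        self._reverse = {}                    # (system, foreign key) -> NDS DN

    def associate(self, dn, system, key):
        self._by_object[dn][system] = key
        self._reverse[(system, key)] = dn

    def lookup(self, dn, system):
        """Foreign key for this NDS object in the given system, or None."""
        return self._by_object.get(dn, {}).get(system)

    def reverse_lookup(self, system, key):
        """NDS DN associated with a foreign object, or None."""
        return self._reverse.get((system, key))

table = AssociationTable()
table.associate("cn=lars,ou=users,o=acme", "payroll", "EMP-0042")
table.associate("cn=lars,ou=users,o=acme", "email", "lars@example.com")
```

Note that the foreign keys are arbitrary strings: nothing requires the payroll identifier to resemble the NDS name.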
Rules
When an event occurs that affects an NDS object, there are several rules that can be invoked to ensure that only appropriate associations are created. These rules fall into the class of rules called Object Mapping Rules. This set of rules includes the Matching Rule, the Create Rule, and the Placement Rule.
The creation of an association between two objects happens when an event occurs to an object that has not yet been associated with another object in the network. For an association to be created, a minimum set of definable criteria must match between the two objects. If these criteria are met, then an association is created. The Matching Rule defines the criteria for determining whether two objects are the same. If no match is found for the changed object, then a new object is created. For this to occur, all of the minimum creation criteria must be met. These criteria are defined by the Create Rule. If all criteria are met, then the object is created. Another rule, the Placement Rule, defines where, in the naming hierarchy, the new object is created.
Associations can be created in one of two ways: a match between objects, or the creation of a new object in a specific location. Once the association between objects is formed, it remains in effect until the objects are deleted.
Authoritative Data Sources
Designating and maintaining authoritative sources for data is often a difficult task. When data begins to flow across multiple systems, there are often political or policy-driven issues related to how the data is shared. Using the Publisher and Subscriber filters in DirXML, administrators can constrain how the data flows and ultimately preserve the data authorities.
Let's consider an example of two applications connected to NDS eDirectory through DirXML: a Human Resources application and an IS&T application. The Human Resources driver is configured with only a Publisher filter. This means that the Human Resources application can publish data into NDS eDirectory, but that it doesn't subscribe to any changes originating with NDS. Likewise, the IS&T application has subscribed to events from NDS eDirectory, but hasn't defined a Publisher filter. This means the IS&T application will accept changes that come from NDS eDirectory, but won't ever send any changes back.
In this example, let's say that the filters defined are for the object class "User" and the attribute "Phone Number". Associations for the user are established in NDS eDirectory and the Publisher and Subscriber filters are set for the driver. This makes the Human Resources application the authoritative source for "Phone Number" information. All changes must originate there and are then propagated through the system. If changes to "Phone Number" occur elsewhere, the changes won't be propagated.
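The Human Resources and IS&T scenario can be simulated in a few lines of Python. The data layout, the drivers dictionary, and the change_in_app function are hypothetical; the sketch only models the direction of flow implied by which filters each driver defines.

```python
directory = {"lars": {"Phone Number": "555-0100"}}
apps = {
    "HR":   {"lars": {"Phone Number": "555-0100"}},
    "IS&T": {"lars": {"Phone Number": "555-0100"}},
}

# Hypothetical driver configuration: the filters each driver defines per channel.
drivers = {
    "HR":   {"publisher": {"User": {"Phone Number"}}, "subscriber": {}},
    "IS&T": {"publisher": {}, "subscriber": {"User": {"Phone Number"}}},
}

def allowed(driver, channel, attr):
    """True if the driver's filter for this channel selects the User attribute."""
    return attr in drivers[driver][channel].get("User", set())

def change_in_app(source, user, attr, value):
    """Apply a local change in one application and propagate it if permitted."""
    apps[source][user][attr] = value
    if not allowed(source, "publisher", attr):
        return False                       # no Publisher filter: change stays local
    directory[user][attr] = value
    for name in drivers:
        if name != source and allowed(name, "subscriber", attr):
            apps[name][user][attr] = value
    return True

accepted = change_in_app("HR", "lars", "Phone Number", "555-0199")
rejected = change_in_app("IS&T", "lars", "Phone Number", "555-0000")
```

After both calls, the directory and the Human Resources application still hold the number that originated in Human Resources, while the change made in IS&T has gone no further than IS&T itself.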
Application Shim
The application shim is the piece of technology that knows how to communicate with the target application. In its simplest form, the application shim is an API translator. It takes code from DirXML and pushes it through the native APIs into the target application.
The application shim assumes that data will be received in the native format for the connected application. If the application doesn't support XML directly, then the XSL transformation engine converts the XML code into the Application Native Format (ANF). Conversely, when data is passed out of the application shim, the data will be translated back into XML for processing in the DirXML engine.
The beauty of the application shim is that it allows DirXML to communicate with existing databases, applications, and directories in a completely non-intrusive way. The applications themselves don't have to change to be able to synchronize data with NDS. Any programmatic interface is sufficient.
The conversion of NDS information into the ANF includes several steps:
The schemas of NDS and the target application are mapped. This mapping describes how the schema in NDS is to be interpreted by the target application so that the data means the same thing to each application.
The actual objects are mapped. This mapping is the contents of the association table tied to each NDS object. This table tells the DirXML driver which object, in the target application, the NDS object maps to. Then when a change occurs, the appropriate object can be immediately updated.
Data represented in the NDS XML format may not be suitable for the target application. If this is the case, the data needs to be transformed using the XSL processor.
The event is transformed. This step takes one event and turns it into the appropriate corresponding event. For example, if I "delete" an object in NDS, that may translate into a "remove" event in my target application.
DirXML Rules and Transformations
The process of manipulating the data in a highly configurable way gives DirXML a huge technical advantage over competitive products. Expressing the data in XML is the key. Once the data is represented in XML, transforming it and manipulating it becomes much easier. The class of rules which define data transformations of the NDS event stream are called Transformational Rules. This class of rules includes the Schema Mapping Rule, the Event Transformation Rule, and the application of XSLT style sheets for data transformation. For applications, this means that synchronizing data with the enterprise can be accomplished very easily.
Event Transformations - Event Rule
Event transformations are very similar to data transformations. As events pass through the Subscriber or Publisher channel, they're expressed in XML format. Using XML transformations, these events can become essentially a different class of event. For example, a "create" event in NDS might actually be rendered as an "add" event. Likewise, a "delete" event in one system might be equivalent to an "archive" event in another system. DirXML provides absolute flexibility in determining what the events in each of the connected systems really mean.
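A minimal sketch of an event transformation follows, assuming (purely for illustration) that the event type is carried by the root element name of the XML document. The EVENT_MAP table and the transform_event function are hypothetical; in DirXML itself such a rule would be expressed in XML or XSLT and stored in the directory.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from directory event names to application event names.
EVENT_MAP = {"create": "add", "delete": "archive"}

def transform_event(xml_text, event_map):
    """Rename the root element of an event document according to event_map."""
    root = ET.fromstring(xml_text)
    root.tag = event_map.get(root.tag, root.tag)   # unmapped events pass unchanged
    return ET.tostring(root, encoding="unicode")

archived = transform_event('<delete class="User" dn="cn=lars,o=acme"/>', EVENT_MAP)
unchanged = transform_event('<modify class="User" dn="cn=lars,o=acme"/>', EVENT_MAP)
print(archived)
print(unchanged)
```

Only the event name changes; the class and object name travel through untouched, which is what lets the same event mean "delete" on one side and "archive" on the other.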
Schema Mapping - Mapping Rule
Another key problem when integrating different applications is the problem of having different schemas. This is true even with applications which have been deployed on a common technology and/or protocol. A good example of this is LDAP. LDAP applications will likely have different schema because schema isn't tied to technology, it's tied to the business solution it implements.
Schema mapping in DirXML is easily accomplished. The NDS schema is read from NDS eDirectory. The DirXML driver supporting a given target application schema is responsible for supplying the DirXML engine with an updated view of the existing schema. Once the two schemas have been identified, a simple mapping can be created between NDS and the target application.
Schema mapping rules are defined in XML and stored in NDS. When data needs to be mapped to fields in the target system, NDS accomplishes this task using the mapping rules. Once a schema mapping is defined in the DirXML driver configuration, the corresponding data can be mapped. Often data in one system is represented differently in another system, even if the data is the same. The following illustrates an example of schema mappings between different systems.
Application | IS&T Application | NDS eDirectory | Human Resources Application
Data Representation | DOB | Birth | Birthdate
Likewise, there are different data representations for each one of these systems.
Application | IS&T Application | NDS eDirectory | Human Resources Application
Data Representation | February 9, 1973 | 2/9/1973 | 9-2-1973
Although the data is the same, it is represented differently. The DirXML schema mapping and transformation rules accomplish this mapping and transformation.
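To make the two tables concrete, here is a small Python sketch that maps the NDS "Birth" attribute to each system's attribute name and date layout. The attribute names and sample dates come from the tables above; the function names are hypothetical, and the sketch assumes the NDS value is month/day/year and the Human Resources value is day-month-year, which is what the February 9 example implies.

```python
from datetime import datetime

# Attribute name each system uses for the same piece of data.
ATTRIBUTE_NAMES = {
    "IS&T Application": "DOB",
    "NDS eDirectory": "Birth",
    "Human Resources Application": "Birthdate",
}

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# How each system writes a date.
FORMATTERS = {
    "IS&T Application": lambda d: f"{MONTHS[d.month - 1]} {d.day}, {d.year}",
    "NDS eDirectory": lambda d: f"{d.month}/{d.day}/{d.year}",
    "Human Resources Application": lambda d: f"{d.day}-{d.month}-{d.year}",
}

def parse_nds_date(text):
    """Parse the month/day/year form shown for NDS eDirectory, e.g. 2/9/1973."""
    return datetime.strptime(text, "%m/%d/%Y").date()

def map_birth(nds_record, target):
    """Rename the attribute and reformat its value for the target system."""
    birth = parse_nds_date(nds_record["Birth"])
    return {ATTRIBUTE_NAMES[target]: FORMATTERS[target](birth)}

ist_view = map_birth({"Birth": "2/9/1973"}, "IS&T Application")
hr_view = map_birth({"Birth": "2/9/1973"}, "Human Resources Application")
```

The rename corresponds to the schema mapping rule and the reformatting to a data transformation; keeping them as two separate tables reflects that they are two separate concerns.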
Object Mapping - Matching Rule
When objects from different applications are associated, it implies that there is a correlation between the objects in the different applications. DirXML allows the flexibility of defining what this correlation is. Using a matching rule, DirXML can dynamically assign associations that link objects in different applications.
The matching rules essentially define the minimum criteria that two objects must meet to be considered "the same". For example, if there were two user objects, one in each of two applications, then a matching rule might state that the objects could only be associated when the "name", "phone number", "e-mail address", and "post address" contained the same data for each object. If these criteria are met, then the objects become associated. Otherwise a new object is created. This provides a means of adding intelligence to the DirXML processes. DirXML is able to detect when two objects are logically the same, but can also automatically create new associations if the minimum standard is not met.
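The four-attribute example can be written as a short Python predicate. This is a sketch of the idea only: objects are modeled as plain dictionaries, and the function names are hypothetical rather than part of any DirXML interface.

```python
MATCH_ATTRIBUTES = ("name", "phone number", "e-mail address", "post address")

def is_match(a, b, attributes=MATCH_ATTRIBUTES):
    """Two objects match only if every listed attribute is present and equal."""
    return all(
        attr in a and attr in b and a[attr] == b[attr]
        for attr in attributes
    )

def find_matches(obj, candidates):
    """Return every candidate that satisfies the matching rule for obj."""
    return [c for c in candidates if is_match(obj, c)]

lars = {"name": "Lars Blattman", "phone number": "555-0100",
        "e-mail address": "lars@example.com", "post address": "1 Main St"}
same = dict(lars, department="Conservation")    # extra attributes are ignored
moved = dict(lars, **{"post address": "2 Elm St"})

matches = find_matches(lars, [same, moved])
```

Only the first candidate matches: the second differs in one of the required attributes, so under this rule it would be treated as a different person.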
Object Creation - Create Rule
Object mapping is another way of describing associations. When objects are associated, they are considered to be mapped. It is important to note that DirXML doesn't require a common "key" to map two objects. Since an association table is maintained for each NDS object, entries representing the corresponding objects are completely arbitrary. Mapping is completely separate from the hierarchy of either system; therefore, each directory system can preserve its unique hierarchy regardless of how extensively it is integrated, through DirXML, with other applications.
The create rule manages an object creation process. It's very important that DirXML doesn't populate other directories if the data is considered to be malformed or incomplete. Therefore, the create rule provides a complete definition of what a well-formed object represents.
Let's take an example of a user object being created in an e-mail application. This creation is mirrored in the NDS tree, but because a create rule specifies otherwise, the addition is not immediately reflected in the payroll application. This is because the rule specifies that only User objects with a complete definition are allowed.
In NDS, the association for the object in the payroll application is put into a pending state until all the criteria are met. As new pieces of data are added to the object definition, eventually a complete object is formed and the create event is allowed to proceed. Once this happens, the object association is set to active.
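The pending-then-active behavior can be sketched as a small state holder that accumulates attributes until the create rule is satisfied. The required attribute names and the PendingCreate class are hypothetical choices for this illustration.

```python
# Hypothetical create rule: attributes a User must have before it is created.
REQUIRED = {"Surname", "Given Name", "Employee ID"}

class PendingCreate:
    """Tracks an object whose creation waits on a complete definition."""

    def __init__(self, required):
        self.required = set(required)
        self.attrs = {}
        self.state = "pending"

    def update(self, new_attrs):
        """Merge newly arrived attributes; activate once the rule is satisfied."""
        self.attrs.update(new_attrs)
        if self.required.issubset(self.attrs):
            self.state = "active"
        return self.state

assoc = PendingCreate(REQUIRED)
after_first = assoc.update({"Surname": "Blattman"})
after_second = assoc.update({"Given Name": "Lars", "Employee ID": "0042"})
```

The first update leaves the association pending because two required attributes are still missing; the second completes the definition and flips the state to active.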
Hierarchy Preservation - Placement Rule
The object placement rule defines the criteria for the placement of the newly created object. These criteria can be based on class, attribute, or path. In other words, rules can be defined which determine where, in a connected application, the new object is to be created. This allows for different hierarchies within each connected application, or the absence of hierarchy altogether.
Association Examples
Earlier we discussed associations between NDS eDirectory and other applications, databases, or directories. Associations are created and leveraged through the DirXML engine's Subscriber and Publisher channels. Following is a closer look at both channels.
Subscriber Channel
Remember, the Subscriber channel is used by drivers to get information out of NDS. For a given event, an association will be used to determine how the event is processed. If an association doesn't exist, then a new object will have to be created, or an association formed with an existing object.
The DirXML engine will first look for an object that matches, based on the contents of the matching rule. If no match is found, then an object will be created, assuming that all the criteria for creating a new object have been met. If an object is created, then the contents of the placement rule will determine where this new object will reside in the target application.
Publisher Channel
Building associations through the Publisher channel works similarly. NDS will be checked to determine if the object already has an association. This is quickly and easily accomplished because all the associations are indexed and reverse indexed to make them easy to search.
If no association exists, then the matching rule will be applied to try to find a match. If multiple matches are found, then DirXML flags an error. If a single match is found, then an association is created. If no match is found, then a new object will be created in NDS.
The create rule will determine if all the minimum attributes are satisfied for the create event. If they are, then a new NDS object is created. It is created in a location in the NDS tree according to the location defined in the placement rule.
Once all objects are created and matched, then the association table is updated and the data is synchronized.
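The Publisher channel sequence described above (existing association, match, create, place) can be summarized as one function. This is a sketch under stated assumptions: the directory is a plain dictionary keyed by distinguished name, and the rule functions, key format, and container names are hypothetical.

```python
class AmbiguousMatch(Exception):
    """Raised when the matching rule finds more than one candidate."""

def handle_published_object(obj, associations, directory, match, can_create, place):
    """Return the DN the application object ends up associated with, or None."""
    key = obj["key"]
    if key in associations:                       # already associated
        return associations[key]
    matches = [dn for dn, entry in directory.items() if match(obj, entry)]
    if len(matches) > 1:
        raise AmbiguousMatch(f"{len(matches)} candidates for {key}")
    if len(matches) == 1:
        dn = matches[0]                           # single match: associate with it
    elif can_create(obj):
        dn = place(obj)                           # placement rule picks the location
        directory[dn] = dict(obj["attrs"])
    else:
        return None                               # create rule not satisfied
    associations[key] = dn
    return dn

def match(obj, entry):
    mail = obj["attrs"].get("mail")
    return mail is not None and mail == entry.get("mail")

def can_create(obj):
    return {"cn", "mail"}.issubset(obj["attrs"])

def place(obj):
    return f"cn={obj['attrs']['cn']},ou=new,o=acme"

directory = {"cn=lars,ou=staff,o=acme": {"mail": "lars@example.com"}}
associations = {}
rules = (match, can_create, place)

matched = handle_published_object(
    {"key": "HR-1", "attrs": {"cn": "lars", "mail": "lars@example.com"}},
    associations, directory, *rules)
created = handle_published_object(
    {"key": "HR-2", "attrs": {"cn": "mia", "mail": "mia@example.com"}},
    associations, directory, *rules)
skipped = handle_published_object(
    {"key": "HR-3", "attrs": {"mail": "x@example.com"}},
    associations, directory, *rules)
```

The three calls exercise the three outcomes: an existing entry is matched, a new entry is created where the placement rule says, and an incomplete object is left unassociated.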
Summary
DirXML covers all the important aspects of meta-directory requirements. Additionally, it covers these requirements in a way that makes the system easy to extend and customize. Because the data is expressed using XML, the data and events can be easily shared between disparate systems and modified to meet the individual needs of each.
DirXML doesn't enforce a hierarchy, or a common naming scheme. DirXML allows for disparate schemas, and even completely different data representations.
Coupled with a development environment that allows developers to easily create new DirXML drivers, DirXML is the most robust and flexible solution for synchronizing applications inside and outside of the enterprise.
DirXML is all about simplifying the sharing of data between applications using eDirectory. DirXML non-intrusively brings the power of NDS to the application without requiring specific integration work to be done. Using XML as the data representation format, data can flow easily between applications while DirXML applies the policies and transformations that make the exchange possible.
Common data is stored in NDS eDirectory while application-specific data is kept with the application. DirXML helps to solve the problem of enterprise data management by isolating the set of common data and controlling its use through profiles stored in the directory.
* Originally published in Novell AppNotes