Making a Much Improved eDirectory with Transitive Synchronization

Kevin Burnett
Senior Research Engineer
Novell AppNotes

01 Mar 2002

In last month's Directory Primer, we discussed the Transitive Vector, what it is, and how it is used. With this information under our belts, we are ready to move on to Transitive Synchronization.

In NDS versions prior to NetWare 5, the original synchronization scheme required all servers in a replica list to be able to synchronize with each server in the replica list. This required implicit communication with each server on the list by checking the current synchronization status and sending updates if a given server is not synchronized.

This communication produced quite a lot of network traffic, which has gradually increased as networks have grown. With the versions of NDS that shipped at the introduction of NetWare 5, the synchronization process was radically changed. Enter Transitive Synchronization.

Transitive Synchronization Briefly Defined

Let's say that you have two servers, a source server and a target (destination) server. With Transitive Synchronization, the source server looks at the target server's entry in the Transitive Vector that the source server holds. If the source server's own time vector is more recent than the target server's time vector, the source server needs to synchronize with the target server.

This action prevents a lot of network traffic, since only the servers whose replicas need updating are actually updated. Transitive Synchronization provides the following benefits:

Network traffic is reduced, since only the needed changes are sent.
Larger replica lists are easily handled because of reduced synchronization traffic.
IP is supported directly, whereas previous synchronization strategies had to support IP through intermediary servers.
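
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration and not NDS code: the dictionary layout, the integer timestamps, and the function name needs_sync are assumptions made for the example.

```python
# Hypothetical model: a time vector maps a replica name to the timestamp
# of the newest change the server has seen for that replica.
def needs_sync(source_vector, target_vector):
    """Return True if the source holds any change newer than the target has seen."""
    return any(
        stamp > target_vector.get(replica, 0)
        for replica, stamp in source_vector.items()
    )


source = {"Replica A": 10, "Replica B": 0, "Replica C": 0}
target = {"Replica A": 0, "Replica B": 0, "Replica C": 0}

print(needs_sync(source, target))  # True
print(needs_sync(target, source))  # False
```

Only a target for which the check returns True receives any traffic, which is where the reduction in network load comes from.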

How Transitive Synchronization Works

The following steps describe the Transitive Synchronization process and how it works.

The Replica Synchronization Process is scheduled. This occurs any time an NDS object or object property is modified.
The Transitive Synchronization Process determines which servers hold replicas that need to be synchronized. Each time the Replica Synchronization Process begins its scheduled run, it first checks the entries in the Transitive Vector to determine which other servers hold replicas that need to be synchronized.

Note: The synchronization process determines which servers need to be synchronized by comparing the timestamps between the Transitive Vectors of the source server that received the update and the destination server. If the timestamp is greater for the source server, the replica updates are transferred. The source server does not request the target server's timestamps, since they are already present in the Transitive Vector that is stored on the source server.
Data updates are transferred from the source server to the target server. This occurs after determining which servers hold replicas that need to be synchronized.
The Transitive Vector is sent from the source server to the target server. After all the replica updates are sent, the source server sends its Transitive Vector information, which is then merged into those of the target server.
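
The steps above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual NDS implementation: the nested-dictionary layout and the names select_targets and merge_transitive_vector are hypothetical, and the merge is modeled as keeping the newer timestamp for every entry. Note that target selection reads only the copy of each target's time vector that is already stored on the source server, as the Note describes.

```python
# Hypothetical layout: server name -> {replica name -> timestamp}.
def select_targets(transitive_vector, source):
    """Step 2: list servers whose stored time vector lags the source's own."""
    own = transitive_vector[source]
    return [
        server
        for server, vector in transitive_vector.items()
        if server != source
        and any(own[replica] > vector.get(replica, 0) for replica in own)
    ]


def merge_transitive_vector(target_tv, source_tv):
    """Step 4: the target keeps the newer timestamp for every entry (assumed rule)."""
    for server, vector in source_tv.items():
        row = target_tv.setdefault(server, {})
        for replica, stamp in vector.items():
            row[replica] = max(row.get(replica, 0), stamp)


zeros = {"Replica A": 0, "Replica B": 0, "Replica C": 0}
tv_a = {
    "Server A": {"Replica A": 10, "Replica B": 0, "Replica C": 0},
    "Server B": dict(zeros),
    "Server C": dict(zeros),
}
tv_b = {name: dict(zeros) for name in tv_a}

targets = select_targets(tv_a, "Server A")
print(targets)  # ['Server B', 'Server C']

# Step 3 (sending the changed objects themselves) is omitted in this sketch.
merge_transitive_vector(tv_b, tv_a)
print(tv_b["Server A"]["Replica A"])  # 10
```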

Examples

The remainder of this discussion will feature two examples that show how Transitive Synchronization works. For the first example, let's say you have three servers: server A, server B, and server C. Let's also say that the following is true:

Server	Contains Replica
Server A	Replica A
Server B	Replica B
Server C	Replica C

The network is set up as illustrated in Figure 1.

Figure 1: Three server/three replica network

Each server is able to communicate with the other two servers, and all three servers are using the IP protocol (a same-protocol network). Remember from last month's discussion of Transitive Vectors that each server in the replica ring holds a Transitive Vector with timestamps showing when the last updates were made on all of the replicas.

The following tables represent the Transitive Vectors on each server. To make this easier to understand, the timestamps are abbreviated as Time(n), where n represents a number of seconds. Also, note that all replicas start out at Time(0). In this example, a change is made to Replica A on Server A, which schedules the replica synchronization process. The Transitive Synchronization process then determines which replicas need to be updated.

When Time = Time(10), since an update to the replica has occurred, Server A will update the timestamp of its replica (Replica A) in the Server A time vector. These changes appear in Server A's table below.

Note: The only timestamp values a server changes are those in its own time vector. It will never change the values in the other servers' time vectors.

Server A's Transitive Vector at Time=time(10)

Time Vector Timestamp	Server	Replica A	Replica B	Replica C
Time(10)	Server A	Time(10)	Time(0)	Time(0)
Time(0)	Server B	Time(0)	Time(0)	Time(0)
Time(0)	Server C	Time(0)	Time(0)	Time(0)

Server B's Transitive Vector at Time=time(10)

Time Vector Timestamp	Server	Replica A	Replica B	Replica C
Time(0)	Server A	Time(0)	Time(0)	Time(0)
Time(0)	Server B	Time(0)	Time(0)	Time(0)
Time(0)	Server C	Time(0)	Time(0)	Time(0)

Server C's Transitive Vector at Time=time(10)

Time Vector Timestamp	Server	Replica A	Replica B	Replica C
Time(0)	Server A	Time(0)	Time(0)	Time(0)
Time(0)	Server B	Time(0)	Time(0)	Time(0)
Time(0)	Server C	Time(0)	Time(0)	Time(0)
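
One way to picture the tables above is as a nested mapping, with one row (time vector) per server. The sketch below is only an illustration: REPLICA_OF, new_transitive_vector, record_local_change, and row_timestamp are hypothetical names, and treating the "Time Vector Timestamp" column as the newest entry in the row is an assumption that happens to match the tables in this example.

```python
# Hypothetical mapping of each server to the replica it holds.
REPLICA_OF = {
    "Server A": "Replica A",
    "Server B": "Replica B",
    "Server C": "Replica C",
}


def new_transitive_vector():
    """Every server starts with all timestamps at Time(0)."""
    replicas = list(REPLICA_OF.values())
    return {server: {r: 0 for r in replicas} for server in REPLICA_OF}


def record_local_change(owner, tv, now):
    """A server stamps only its own replica in its own time vector."""
    tv[owner][REPLICA_OF[owner]] = now


def row_timestamp(vector):
    # Assumption: the first table column equals the newest entry in the row.
    return max(vector.values())


tv_on_a = new_transitive_vector()
record_local_change("Server A", tv_on_a, 10)
for server, vector in tv_on_a.items():
    print(row_timestamp(vector), server, list(vector.values()))
# 10 Server A [10, 0, 0]
# 0 Server B [0, 0, 0]
# 0 Server C [0, 0, 0]
```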

When Time = time(11), Server A synchronizes the changes to Server B. It sends a copy of its Transitive Vector to Server B. Server B copies Server A's time vector. Server B then changes its own vector to reflect that Replica A is at time(10). Lastly, Server B updates its timestamp of Replica B to show that the last changes were made at time(11).

Server A's Transitive Vector at Time = time(11)

Time Vector Timestamp	Server	Replica A	Replica B	Replica C
Time(10)	Server A	Time(10)	Time(0)	Time(0)
Time(0)	Server B	Time(0)	Time(0)	Time(0)
Time(0)	Server C	Time(0)	Time(0)	Time(0)

Server B's Transitive Vector at Time = time(11)

Time Vector Timestamp	Server	Replica A	Replica B	Replica C
Time(10)	Server A	Time(10)	Time(0)	Time(0)
Time(11)	Server B	Time(10)	Time(11)	Time(0)
Time(0)	Server C	Time(0)	Time(0)	Time(0)

Server C's Transitive Vector at Time = time(11)

Time Vector Timestamp	Server	Replica A	Replica B	Replica C
Time(0)	Server A	Time(0)	Time(0)	Time(0)
Time(0)	Server B	Time(0)	Time(0)	Time(0)
Time(0)	Server C	Time(0)	Time(0)	Time(0)

When Time = time(12), Server B synchronizes the changes to Server C. Server B sends its versions of the time vectors for Server A and Server B, since they are newer than those on Server C. Server C copies those time vectors exactly as they were sent. Then it updates the timestamps in its own time vector with the most recent information from the time vectors for Server A and Server B. Lastly, it updates its own timestamp of Replica C to show that the latest changes were made at time(12). It would have been possible for Server A to have sent these changes to Server C. With Transitive Synchronization, it really doesn't matter which server sends the changes.

Server A's Transitive Vector at Time = time(12)

Time Vector Timestamp	Server	Replica A	Replica B	Replica C
Time(10)	Server A	Time(10)	Time(0)	Time(0)
Time(0)	Server B	Time(0)	Time(0)	Time(0)
Time(0)	Server C	Time(0)	Time(0)	Time(0)

Server B's Transitive Vector at Time = time(12)

Time Vector Timestamp	Server	Replica A	Replica B	Replica C
Time(10)	Server A	Time(10)	Time(0)	Time(0)
Time(11)	Server B	Time(10)	Time(11)	Time(0)
Time(0)	Server C	Time(0)	Time(0)	Time(0)

Server C's Transitive Vector at Time = time(12)

Time Vector Timestamp	Server	Replica A	Replica B	Replica C
Time(10)	Server A	Time(10)	Time(0)	Time(0)
Time(11)	Server B	Time(10)	Time(11)	Time(0)
Time(12)	Server C	Time(10)	Time(11)	Time(12)

When Time = time(13), Server C gets ready to synchronize with Server A. It compares the timestamps in its own time vector with the values of the other time vectors. It sends the time vectors for Server B and Server C because the copies Server A holds are older than Server C's. The time vector for Server A is not sent because the most recent changes Server C has are at time(10), and Server A is current as of time(10). Server A receives the updates from Server C. It then copies the time vectors for Server B and Server C and updates its own time vector with the most current information.

Server A's Transitive Vector at Time = time(13)

Time Vector Timestamp	Server	Replica A	Replica B	Replica C
Time(13)	Server A	Time(13)	Time(11)	Time(12)
Time(11)	Server B	Time(10)	Time(11)	Time(0)
Time(12)	Server C	Time(10)	Time(11)	Time(12)

Server B's Transitive Vector at Time = time(13)

Time Vector Timestamp	Server	Replica A	Replica B	Replica C
Time(10)	Server A	Time(10)	Time(0)	Time(0)
Time(11)	Server B	Time(10)	Time(11)	Time(0)
Time(0)	Server C	Time(0)	Time(0)	Time(0)

Server C's Transitive Vector at Time = time(13)

Time Vector Timestamp	Server	Replica A	Replica B	Replica C
Time(10)	Server A	Time(10)	Time(0)	Time(0)
Time(11)	Server B	Time(10)	Time(11)	Time(0)
Time(12)	Server C	Time(10)	Time(11)	Time(12)

The previous example shows how one update would be propagated through a replica ring. It has been simplified to show only one change at a time. In reality, many updates will typically occur at the same time.
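
The walk-through above can be replayed with a short simulation. This is a sketch under assumptions rather than NDS code: the sync function and its rules (send only the time vectors that are newer than the target's copies, copy them as sent, fold them into the target's own row, then stamp the target's own replica) are one reading of the prose in this example, and all names are hypothetical.

```python
REPLICA_OF = {
    "Server A": "Replica A",
    "Server B": "Replica B",
    "Server C": "Replica C",
}


def fresh_tv():
    """A Transitive Vector with every timestamp at Time(0)."""
    return {s: {r: 0 for r in REPLICA_OF.values()} for s in REPLICA_OF}


def is_newer(vector, other):
    return any(vector[r] > other[r] for r in vector)


def sync(tvs, source, target, now):
    """Apply one source -> target synchronization at time `now`."""
    src_tv, dst_tv = tvs[source], tvs[target]
    own = dst_tv[target]
    for server, vector in src_tv.items():
        # Only newer vectors are sent; the target's own row is never overwritten.
        if server == target or not is_newer(vector, dst_tv[server]):
            continue
        dst_tv[server] = dict(vector)  # copied exactly as sent
        for replica, stamp in vector.items():
            own[replica] = max(own[replica], stamp)
    own[REPLICA_OF[target]] = now  # target stamps its own replica


tvs = {server: fresh_tv() for server in REPLICA_OF}
tvs["Server A"]["Server A"]["Replica A"] = 10  # the change at Time(10)
sync(tvs, "Server A", "Server B", 11)
sync(tvs, "Server B", "Server C", 12)
sync(tvs, "Server C", "Server A", 13)

print(tvs["Server C"]["Server B"])  # {'Replica A': 10, 'Replica B': 11, 'Replica C': 0}
print(tvs["Server A"]["Server A"])  # {'Replica A': 13, 'Replica B': 11, 'Replica C': 12}
```

Under these assumed rules, the simulation ends with the same replica timestamps that the example gives for Server C at time(12) and for Server A's own row at time(13).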

For the second example, let's assume we have a mixed-protocol network. Note that in the previous example it didn't matter which server sent the updates. When Server B received an update from Server A, Server B sent the same update to Server C.

In pre-NetWare 5 versions of NDS, any changes to a replica had to be sent from that server to all of the other servers in the replica ring. Because Transitive Synchronization lets any server pass updates along, it adds the ability for synchronization to occur in rings where not every server is able to communicate directly with every other server.

Let's look at a situation where a server that supports only IP is not able to communicate with a server that supports only IPX. However, if a server in the replica ring speaks both IPX and IP, that server can act as a mediator. The only requirement in a mixed IP and IPX network is that one server in the replica list support both protocols. Figure 2 illustrates the process of synchronizing through a mediator.

Figure 2: Synchronizing through a mediator

In Figure 2, Server A is unable to synchronize directly with Server C because of the different protocols. However, both Server A and Server C can synchronize with Server B, since Server B supports both protocols. In this case, Server B is called the mediator.

For example, Server A sends its updates to Server B. Server B sends these updates to Server C and then updates its own Transitive Vector to indicate that Server C has received the updates from Server A. The next time that Server A and Server B synchronize, Server A checks Server B's Transitive Vector and then updates its own Transitive Vector to indicate that Server C has been synchronized. In this way, Server C receives updates from Server A without ever talking directly with Server A.

One side effect of this type of replica ring is that Server A will still attempt to talk directly to Server C and will receive "Unable to communicate with Server C" errors. This does not indicate a problem with NDS. It just shows that NDS has detected a situation that is really not a problem.
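
The mediator scenario can be modeled with a small sketch. It is deliberately simplified and is not how NDS stores its data: each server tracks a single timestamp per peer instead of a full time vector, both sides merge what they know during a sync, and the names PROTOCOLS, can_talk, known, and sync are all hypothetical.

```python
# Hypothetical protocol support for the three servers in Figure 2.
PROTOCOLS = {
    "Server A": {"IP"},
    "Server B": {"IP", "IPX"},
    "Server C": {"IPX"},
}


def can_talk(x, y):
    """Two servers can communicate only if they share a protocol."""
    return bool(PROTOCOLS[x] & PROTOCOLS[y])


# known[x][y]: newest change that server x believes server y has received.
known = {s: {t: 0 for t in PROTOCOLS} for s in PROTOCOLS}
known["Server A"]["Server A"] = 10  # a change is made on Server A


def sync(source, target):
    """Push updates from source to target, then merge what both sides know."""
    if not can_talk(source, target):
        return False  # corresponds to the "Unable to communicate" error
    known[target][target] = max(known[target][target], known[source][source])
    for server in PROTOCOLS:
        merged = max(known[source][server], known[target][server])
        known[source][server] = known[target][server] = merged
    return True


print(sync("Server A", "Server C"))   # False: no shared protocol
print(sync("Server A", "Server B"))   # True
print(sync("Server B", "Server C"))   # True: Server B acts as the mediator
print(sync("Server A", "Server B"))   # True: Server A now learns about Server C
print(known["Server A"]["Server C"])  # 10
```

The failed first call mirrors the harmless error described above: the update still reaches Server C through Server B, and Server A learns of it on its next synchronization with Server B.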

I would like to credit the eDirectory Core Development Team and Novell Education Group for providing critical information for the success of this column.

* Originally published in Novell AppNotes

Disclaimer

The origin of this information may be internal or external to Novell. While Novell makes all reasonable efforts to verify this information, Novell does not make explicit or implied claims to its validity.