Implementing an Enterprise-Wide White Pages/Yellow Pages Lookup Service with NDS eDirectory

Phyllis Morris
Consultant
Novell Consulting, Atlanta
pmorris@novell.com

01 Jun 2000

Most businesses have a great deal of data from many sources about their employees, departments, products, and services. The problem is that there is no common interface for locating this information. In such situations, the obvious solution is to consolidate the pertinent information into a single directory and then implement some type of lookup application to simplify access to the data.

This AppNote describes a project undertaken by XYZ Company to implement an enterprise-wide information lookup service. (The name of the company has been fictionalized to safeguard privacy.) The impetus for this project was the realization that the company's existing lookup systems were antiquated and woefully inadequate. Business opportunities were being missed on a daily basis because employees could not quickly and accurately locate information in order to transfer a customer to the appropriate resource.

After a detailed needs analysis and a rigorous evaluation of potential components, XYZ Company chose Novell's NDS eDirectory as the basis for their solution, along with products from Oblix. Among other criteria, these components met the requirement to interface with several Lightweight Directory Access Protocol (LDAP) v3-compliant directories that already existed at the company.

This AppNote starts with a brief overview of the project, and then details the processes that XYZ Company went through in determining the application requirements, identifying the existing sources of information, and selecting the technologies and products for implementation. It also describes a series of tests that were conducted to determine the most appropriate choice for the back-end directory.

For more information about NDS eDirectory, visit the product home page at:

http://www.novell.com/products/nds/

Project Overview
Determining the Application Requirements
Determining Existing Information Sources
The Selection Process
Conclusion

Project Overview

XYZ Company had a vision of consolidating pertinent information into one cohesive directory that would serve as the central source of information (data repository) for all 12,000 employees in the company. The first step in realizing this vision was assembling a Project Team to research and implement a solution.

The Project Team decided to model their solution after a paper telephone directory consisting of both White Pages and Yellow Pages. That is, data pertaining to individual employees would make up the "White Pages" portion of the service, while product and service information (which ties into business units or departments) would form the "Yellow Pages." Both the White Pages and Yellow Pages would be combined into one application, which was to be accessed from a single interface.

The White Pages portion of the project seemed relatively straightforward, as it was simply a replacement for several existing systems. However, the Yellow Pages would be an entirely new application that would collect XYZ Company's hundreds of business offerings (products and services) in a searchable format that any employee could understand and navigate through.

As is so often the case in projects such as this, the team recognized that success would require a combination of technology implementation and the changing of business processes and procedures. It would serve no purpose to throw more technology at the problem, nor would it do any good to reshape existing business processes without implementing any new technology.

After this initial analysis, the Project Team outlined a three-step approach for the project:

Determine what was actually required of the application
Identify what data was available to search and display
Choose a front-end application and a back-end directory

Determining the Application Requirements

Since the lookup service was to be a new Web-based application, the Project Team began by examining existing processes of looking up information (individual employee, product or service, and departmental). Areas of analysis included:

How were customer calls taken and what types of questions were asked in order to route them?
What fields did the users need to "see" on the screen?
How were searches to be performed?
What fields needed to be included in the search criteria?
Which features would be absolutely necessary?
Which features would be nice, but not required?

Other initial project requirements were identified as follows:

Employees would access the corporate directory from a standard Web browser.
The back-end data repository must be capable of storing hundreds of thousands of records, which must be quickly retrievable.
Directory updates should occur regularly from the main data sources, ideally in real time.
A means must be provided to add more data fields quickly and easily.

Questionnaires

As part of the discovery process, the Project Team created several questionnaires to help them define the application's functional requirements, define the data required for the application, and determine where data sources may be lacking. These questionnaires were geared to specific job duties that the application needed to address. The population questioned included all levels of staff: operators, support staff, executive staff, end-users, and so on.

Sample questionnaires are provided here for reference.

Operator Questionnaire

#	Question	Response
1	What are your current job responsibilities?
2	How do you currently perform your job?
3	What percentage of your time is spent on calls?
4	What percentage of your time is spent on work other than calls?
5	Describe in detail how you currently do your job.
6	What is the most common type of call you take?
7	How long does it take you to answer a call now?
8	Rank the percentage of call types you receive.
9	What types of information does a caller typically have?
10	How is XYZ Company employee information handled differently than contractor information?
11	What is the source or sources of information for contractor data?
12	How do you inform the caller of a person's location in order to dial 9 or not?
13	What is the most difficult aspect of your job?
14	Where is the process difficult?
15	What is the most difficult thing to find out for a caller?
16	What makes these calls difficult?
17	Rank on a scale of 1-10 the difficulty level of calls you take.
18	What key information fields are needed to perform your job?
19	What information sources do you currently use?
20	Programs?
21	Paper lists?
22	People?
23	What other sources?
24	How frequently are these sources updated?
25	Where do you go when all else fails?
26	Can you sort or search through the data to answer caller questions?
27	How could the sort or search features be better so you can find your answers faster?
28	How many systems do you now use to answer all call questions?
29	If the information in these systems is wrong what do you do?
30	How do you change the information in these systems?
31	What is the main reason that customer calls are misdirected?
32	What do you do if the customer calls back because the information in the system is wrong?
33	What do you do if the customer call was routed incorrectly?
34	What problems would you like to see eliminated from the current process?
35	If you were to design a new application or system what would you like to see the new application do?
36	What views or information would you like to see displayed?
37	How would that help you perform your job better?
38	What would you change if you could?
39	When a merger occurs, how are you made aware of new information?
40	What types of information are you given?
41	How is that information updated?
42	How difficult is the merger process in respect to your job?

Operator Questionnaire - Supervisor

#	Question	Response
100	Draw the process you or your staff follow to locate a person or service in XYZ Company.
101	Are flow charts available of the current process?
102	What are the performance service levels your staff is measured against?
103	What plans have been made to streamline the current process?
104	What would be the key benefits of a more robust or efficient system?
105	How do you perceive the new system would impact your staff?
106	What productivity levels would be increased with the new system?
107	What is the current process for updating the systems?
108	When your staff knows data is wrong or needs to be updated, what is your course of action?
109	What information do you currently have difficulty getting or finding?
110	What percentages of calls go to what areas?
111	Where is the biggest problem area for your staff?
112	How do you currently handle these issues?
113	If service or business unit information were needed, what categories of information would you need to see?
114	Your staff needs to see?
115	If people information were needed, what categories of information would you need to see?
116	Your staff needs to see?
117	What hardware platform does your staff have?
118	What are the main issues you face when a merger occurs?
119	How do you handle those issues?
120	How often is the information updated?
121	Is information updated accurately?
122	How many information sources do you typically deal with?

Once these questionnaires were completed, the Project Team compiled the responses and divided the results into two categories:

Preliminary Findings dealt with the shortcomings of the systems in use. These issues would become the Functional Requirements of the new application.
Required Fields would be the actual data elements available for display and search from the new Web-based application.

Preliminary Findings. The preliminary findings gleaned from the completed questionnaires are summarized below:

There is no central source of information.
Too many systems are being used to look up information.
Department/product/service-related information is inaccurate, lacks detail, or is nonexistent.
Department names are not standard; a single department can have multiple names as well as multiple nicknames.
The physical location of employees and departments cannot be easily discovered or identified.
E-mail information for employees is not readily available on the current lookup systems.
The systems currently in use are slow or cumbersome to use.
The search mechanisms used on existing systems are not robust.
There is a heavy dependence on paper lists instead of systems; paper lists play a big role in finding phone numbers.
Inaccurate information exists in all systems.
No reliable procedures are in place to update incorrect data in the systems.

It also became clear in the discovery process that each business area had unique, specific requirements that the application needed to fulfill. For example, operators and receptionists were more concerned with people-oriented information (the White Pages portion of the application). Service call centers were more concerned with product and service information (the Yellow Pages). Groups that travel extensively, such as marketing and management, needed the ability to access the directory information remotely, without having to connect to the corporate network.

Required Fields. The following employee information was deemed essential for the White Pages portion of the application:

Employee Phone Number
Employee Fax Number
Employee Name
Employee Nickname
Employee Title
Employee Admin Name
Employee Admin Phone number
Employee Scheduled Hours
Employee Mail Stop
Employee Floor Location
Employee Regional Association
Employee Manager Name
Employee E-mail address
Employee Forwarding Number

The following products and services information was deemed necessary for the Yellow Pages portion of the application:

Department Name
Business Unit Title
Department Phone number - General
Department Contact number - Internal /external
Department Contact number - VRU (Yes or No)
Department Phone Numbers instead of VRU listed by location
Department Hours of Operation
Department Fax Number
Department Nicknames
Department Description
Department Product
Department Products by Region, State, City
Department Comment
Department Members
Department Members - by Region
Department Mail Stop
Department Floor Location
Department Physical Address
Department Region
Department City
Department Manager Name
Department Manager Phone #
Department Director or Department Head
Department Director Admin Assistant Number
Department Sub Departments
Department Sub Departments - Description
Department Sub Departments - by Region
Department Org. Chart view

Far more fields than these were requested, but the Project Team knew they had to limit the fields to the most critical. In the elimination process, they considered questions such as:

What data source would the information be available from?
Was the data actually available for consumption in a database?

After reviewing these requirements, the Project Team realized they had to overcome a great many more challenges than they originally thought in order to meet the users' needs and have the new application considered a success.

Determining Existing Information Sources

Once the functional requirements for the application and the required data fields had been established, the next order of business was to determine which of the company's existing databases could be used to draw the information from, and whether the information was accurate enough to be used.

Since several applications were already being used to perform the lookup functions for the White Pages (employee data), these were the first databases to be examined. In the case of product and service data for the Yellow Pages, no single database contained the needed information. Instead, each department and business unit maintained this information and generally did not centralize any of it.

Since the existing information sources were limited as to what data they could provide, the team found it necessary to pare down what data would be included within the new application. At the same time, the team identified database applications that could be used to provide information in the future--provided they were cleaned up and properly maintained.

At this stage of the project, the team determined that the company was not in a position to implement real-time updates to the back-end data repository. Instead, there would be nightly updates that would reflect any changes in the selected databases. (An implementation of Novell's DirXML technology, which would provide the pipes for real-time updates, was placed on the list for consideration in the second release of the application.)

As the team examined the existing data sources, they found that many were inaccurate and not properly updated. Several were considered "self-service" databases, with employees updating their own information about themselves. In general, this self-serve approach had not been successful in keeping the data current. (This was a glaring example of where business processes needed to be changed in order to resolve these issues.) To further complicate matters, several of the existing sources that were earmarked for usage were antiquated "legacy" applications in the process of being phased out.

Out of the myriad of databases available, the team ultimately narrowed down the field to the following:

PeopleSoft for the majority of the White Pages (people) information
Ebase for e-mail information
TelBase for telephone information
REBase for location and address information
ProServBase for product and service information

With the exception of PeopleSoft, these were all proprietary databases that had been modified by XYZ Company for end user applications. The team decided that they were to be combined to allow population of all the requested required data fields for the new application. They also determined that while the back-end data repository would contain data from these sources, it would not become the source of record. The responsibility for data accuracy, maintenance, and administration would remain with the owners of the source databases.
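Combining several source databases into one directory entry per person can be sketched as follows. This is a minimal illustration only: it assumes each source can be exported as a mapping keyed by a shared employee ID, and that earlier sources take precedence when two sources supply the same field. The article states neither the join key nor a precedence rule, and the field names used with it are hypothetical.

```python
from typing import Dict, Iterable, Mapping


def merge_sources(
    sources: Iterable[Mapping[str, Mapping[str, str]]]
) -> Dict[str, Dict[str, str]]:
    """Merge per-source records into one entry per employee ID.

    Assumption (not from the article): sources are listed in priority
    order, so the first non-empty value seen for a field is kept.
    """
    merged: Dict[str, Dict[str, str]] = {}
    for source in sources:
        for emp_id, fields in source.items():
            entry = merged.setdefault(emp_id, {})
            for name, value in fields.items():
                # Skip empty values and never overwrite an earlier source.
                if value and name not in entry:
                    entry[name] = value
    return merged
```

Because the directory is not the source of record, a merge like this would be rerun from the source exports on each nightly update rather than edited in place.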

Identifying Technology Requirements

While the Project Team conducted the initial discovery process, a Technology Team was tasked with determining the technological requirements needed to make the new application a success. This team identified the following basic performance criteria for the application:

The application was to be Web-based and accessible from a variety of Web browsers.
Accessing the required information should take no more than five mouse clicks and no more than three screen changes.
Searches were to be performed and the results returned in less than 35 seconds.

The Technology Team divided the application into three major pieces:

The actual front-end application that would make the data available via a Web browser
A Web server to support the front-end application
A back-end or directory where the data would be stored

Although the company wanted to leverage existing technology and infrastructure where possible, they were willing to re-examine previous technology-related decisions and put money into new equipment if there were sound technical reasons to do so.

Front-End Requirements. The Technology Team put together a list of criteria that the selected front-end application should ideally meet. That list was then boiled down to the following must-have functions for the selected product.

It must be able to interface with and use existing LDAP v3-compliant directories.
It must correlate searches on existing LDAP v3-compliant directories, efficiently providing search capabilities with wildcards and with multiple search fields.
It must support LDAP v3 referrals.
It must support graded authentication to the data repository.
Data presented through the front-end must be retrieved from the existing LDAP v3-compliant directories in real time.
It must allow data available for presentation to be customized based on user or group level access through graded authentication to the directory.
It must be ready to go "off the shelf," needing little customization prior to deployment.
It must ensure that information stored in existing LDAP v3-compliant directories remains secure, with a process in place to allow secure authorized updates to the information.
It must be customizable or modifiable to handle unlimited data fields and objects.
It must be customizable or modifiable to support new field requirements.
It must be able to graphically map relationships based on data from existing LDAP v3-compliant directories.
It must be able to graphically display location information.
It must be browser independent and function with all major browsers.
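The requirement for wildcard searches across multiple fields maps onto LDAP search filters. The sketch below shows one way a front-end might turn user input into a filter string; it is not how Oblix does it, which the article does not describe. The escaping follows the LDAP filter string rules (RFC 4515), and the convention that a trailing asterisk in user input means "starts with" is an assumption made for this example.

```python
def escape_filter_value(value: str) -> str:
    """Escape characters that are special inside an LDAP filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def build_filter(criteria: dict) -> str:
    """AND together one clause per search field.

    Assumption for this sketch: a trailing '*' typed by the user
    requests a prefix match; every other character is taken literally.
    """
    clauses = []
    for attr, text in criteria.items():
        wildcard = text.endswith("*")
        core = escape_filter_value(text[:-1] if wildcard else text)
        clauses.append(f"({attr}={core}{'*' if wildcard else ''})")
    if len(clauses) == 1:
        return clauses[0]
    return "(&" + "".join(clauses) + ")"
```

For example, searching for a surname beginning with "Mor" in Atlanta, using the standard sn and l attributes, yields the filter (&(sn=Mor*)(l=Atlanta)).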

Web Server Requirements. The Technology Team recognized that the actual Web server decision would probably be based on the front-end decision. However, they did specify a few requirements for the Web server:

It must support multiple platforms.
It must be able to perform screen prints.

They also stipulated that the Web server be an "off the shelf" product such as Netscape's Web server or Microsoft's IIS. XYZ Company was not willing to add human resources or spend money on training costs in order to support a proprietary Web server.

Back-End (Directory) Requirements. As with the front end, the Technology Team assembled criteria for the back-end (directory). The requirements were pared down to the following list:

It must be based on LDAP v3 standards.
It must have the ability to support secure links in an SSL implementation.
It must be a firmly established product.
It must be scalable to millions, if not billions, of objects.
It must have an extensible schema.
It must have the ability to organize the directory into complex structures with many Organizations and Organizational Units.
It must be self-healing and self-replicating.
It must provide 100% reliability and up-time.
It must allow or make available real-time links to and from other databases.

As with the Web server, the Technology Team wanted the back-end to be a known commodity, as XYZ Company wanted to avoid retraining as much as possible. The team also wanted to use the existing infrastructure, if possible. However, they were willing to look outside of the existing technology, so they did not want to limit their technology choices by placing this restriction on the "required" list.

The Selection Process

Once the criteria were established for the application, XYZ Company sent Requests for Proposals to companies that had expressed interest in the project. The requests included the requirements documentation and a questionnaire, and candidates were asked to respond in writing. Many companies replied, and XYZ Company narrowed down the candidates based on the responses to the questionnaires.

Front-End Selection

Through a weighted process of elimination, XYZ Company narrowed down the choices for the front end to one: Oblix. Oblix came closest to supporting all of the requirements "off the shelf," with limited customized coding. Other companies were able to provide similar applications, but a great deal of customization would be required.
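The article does not give the weights or scores used, so the following is only a sketch of what a weighted process of elimination could look like. Candidates that fail any must-have requirement are dropped, and the rest are ranked by weighted score. All criterion names, weights, and scores passed to it are hypothetical.

```python
def rank_candidates(candidates, must_have, weights):
    """Drop candidates missing a must-have, then rank by weighted score.

    candidates: {name: {criterion: score}}, where 0 means "not supported"
    must_have:  set of criteria that must score above 0
    weights:    {criterion: weight}
    All inputs are illustrative; none come from the article.
    """
    ranked = []
    for name, scores in candidates.items():
        if any(scores.get(c, 0) <= 0 for c in must_have):
            continue  # eliminated: a must-have requirement is unmet
        total = sum(w * scores.get(c, 0) for c, w in weights.items())
        ranked.append((name, total))
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```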

Note: The Novell eGuide product was not available for release at the time this project was underway. Because eGuide performs all of the Functional Requirements the Project Team identified, it would have been a viable candidate.

Back-End Selection

Based on their support for the functional requirements, the choices for the back-end were narrowed down to two: Novell's NDS eDirectory and Netscape's Directory Server.

To determine which back-end was the best fit for the front-end application based on performance, the Project Team requested that the Technology Team test the products in a lab environment.

The objective of the tests was to determine which directory product demonstrated the best overall performance. Since the new application would mainly be searching the directory and displaying the results, the Technology Team was looking for high performance capabilities in this area. Secondary issues were ease of integration with Oblix and scalability.

The testing was divided into two phases: phase one would use a single container structure to house all of the records; phase two would use a multiple container structure. Both phases were required to handle over 200,000 objects.

Identical hardware was used for each directory server: the Dell PowerEdge 4300, a Pentium II 450 MHz CPU with 512 MB of RAM. Each server had 36 GB of disk space available, running in a RAID 5 array, with identical Intel 10/100 Ethernet network interface cards. Both Netscape and Novell approved the hardware configuration as being sufficient to handle a 200,000-record directory for their respective products.

The OS platform for Netscape Directory Server software was Microsoft Windows NT, while the platform for NDS eDirectory was NetWare.

Phase One Tests: Single Container Directory

For this phase of testing, XYZ Company's HR department provided the Technology Team with a file containing 207,004 live employee records. The team set up a simple, one-container directory structure in both NDS and Netscape to hold the employee data. The testing then proceeded as follows:

Extend the schema of both directories to accommodate the HR data.
Convert the HR file into an LDIF file and import it into the directories (NDS and Netscape) which serve as the data repository for the Oblix application to access.
Run intensive searches (random and sequential) against each directory to obtain a baseline for access speed and performance.
Install the Oblix application to access each directory.
Run intensive sequential searches against each directory via the Oblix interface to determine access speed and performance.
Capture data packets during the searches in order to determine the efficiency of the directory searches and responses back to the client.
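The HR-to-LDIF conversion step in the list above can be sketched as follows. The article does not describe the HR file layout or the schema used, so this example assumes each record is a dictionary with a plain alphanumeric uid plus cn and sn values, uses the standard inetOrgPerson object class, and places entries in a hypothetical container named ou=People,o=XYZ. Values that are not safe to write literally are base64-encoded, as the LDIF format (RFC 2849) requires.

```python
import base64


def ldif_line(attr: str, value: str) -> str:
    """Return one LDIF attribute line, base64-encoding unsafe values."""
    safe = (
        value != ""
        and value[0] not in " :<"
        and not value.endswith(" ")
        and all(32 <= ord(c) < 127 for c in value)
    )
    if safe:
        return f"{attr}: {value}"
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"{attr}:: {encoded}"


def to_ldif(records, container="ou=People,o=XYZ"):
    """Render records as LDIF entries in a single container.

    Assumptions: rec['uid'] needs no DN escaping, and each record
    carries the cn and sn values that inetOrgPerson requires.
    """
    entries = []
    for rec in records:
        lines = [
            ldif_line("dn", f"uid={rec['uid']},{container}"),
            "objectClass: inetOrgPerson",
        ]
        for attr, value in rec.items():
            if value:  # omit empty fields rather than writing blanks
                lines.append(ldif_line(attr, value))
        entries.append("\n".join(lines))
    # Entries are separated by a blank line.
    return "\n\n".join(entries) + "\n"
```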

For a second series of searches, the Oblix interface was run on a single client. These sequential tests looked for finite, specific matches in only one data field. The results for this second series of tests were measured in seconds, not minutes.

Phase One Test Results. In phase one, the team encountered no issues in extending the schema of either directory. However, although Netscape imported the file more rapidly, it failed to import about 26,600 records from the command line interface. Unfortunately, no error log entries were recorded to indicate which records were rejected and what might have caused the rejections. NDS encountered no such difficulty, successfully importing all 207,004 records.
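When an import tool logs nothing about rejected records, one way to identify them is to compare the identifiers in the source file with those exported from the directory afterwards. The article does not say the team did this; the sketch below is only an illustration, and the ID lists are assumed inputs. The second function puts the rejection count in proportion: about 26,600 of 207,004 records is roughly 12.85 percent of the file.

```python
def missing_ids(source_ids, imported_ids):
    """Return source IDs that are absent from the directory, in source order."""
    imported = set(imported_ids)
    return [i for i in source_ids if i not in imported]


def rejection_rate(total, rejected):
    """Fraction of records that failed to import."""
    return rejected / total


# Figures from the phase one import (the rejected count is approximate).
print(f"{rejection_rate(207_004, 26_600):.2%}")
```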

Overall, Netscape performed poorly during the baseline testing, consistently completing random information search requests (name, address, phone number, name, fax number, and so on) at least 5 minutes behind NDS. The disparity in completion time grew to 30 to 40 minutes or more as more client workstations (up to 4) were added to request searches. The only instances where Netscape outperformed NDS were sequential (name, name, name...) information search requests made from one client (a 3-minute difference) or two clients (a difference of 1 minute or less).
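The distinction between random and sequential search requests can be made concrete with a small timing harness. The article does not describe the test tool the team used, so this is a sketch under assumptions: "sequential" repeats the same field for every request, "random" picks a field at random for each request, and the search callable stands in for whatever client actually queries the directory.

```python
import random
import time


def make_requests(records, attrs, mode, seed=0):
    """Build (attribute, value) search requests from sample records.

    mode="sequential": always search on attrs[0] (name, name, name...).
    mode="random":     choose a field at random for each request.
    """
    rng = random.Random(seed)
    requests = []
    for rec in records:
        attr = attrs[0] if mode == "sequential" else rng.choice(attrs)
        requests.append((attr, rec[attr]))
    return requests


def time_requests(search, requests):
    """Run search(attr, value) for each request; return elapsed seconds."""
    start = time.perf_counter()
    for attr, value in requests:
        search(attr, value)
    return time.perf_counter() - start
```

Running the same request list against each directory, from one to four clients, would give completion times that can be compared in the way the results above are reported.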

Processor utilization on the Netscape/NT server was continually pegged at 100% for the duration of the search requests. The NDS/NetWare server held fairly steady at 40-65% utilization. After the team first observed the high utilization, they optimized the NT server to enhance performance and ran the tests again. However, this optimization did not significantly improve the results.

Due to the tremendous disparity of the baseline search results (most notably when multiple machines were employed), the Technology Team felt compelled to run this whole series of tests again. The results proved to be the same the second time.

In the second series of tests, NDS again completed the searches and returned the results back to the client more rapidly (0.2 to 100 seconds faster). The only instance where this was not true was when performing a "Full Name" search. Upon further research, the team discovered that Netscape was returning fewer matches back to the client (anywhere from 100 to 3000, depending on the search criteria) due to the 26,600+ rejected records in the initial import phase.

Again, processor utilization on the Netscape/NT server was continually pegged at 100% for the duration of the search requests. The NDS/NetWare server held at between 40-65% utilization.

By analyzing the captured data packets, the team ascertained that Oblix initiates an extra request sequence to the Netscape directory prior to initiating any new search. This request appears to communicate which directory access version Netscape should be prepared for (LDAP v2 or LDAP v3) before the search criteria are actually sent. NDS does not require this extra communication packet.

Phase Two Tests: Multiple Container Directory

The multiple-container approach used in this phase of testing was more representative of a real-world scenario. The single-container structure used in phase one is not a normal implementation. Similarly, the file from HR provided only a few key data elements, not the detailed information that would be the case when Oblix is implemented (regardless of the back-end chosen).

The team used the same hardware for phase two. They reinitialized all systems and reloaded the directory software in order to start with a clean slate. They then created a randomly-generated, detailed employee data file containing 200,000 records. Testing proceeded as follows:

Extend the schema of both directories to accommodate the new data.
Import the data file into the respective directories. (Since the file was produced in a standardized LDIF format that both directories could import, no conversion was required.)
Run intensive random and sequential searches against each directory to obtain a baseline for access speed and performance.
Install the Oblix application to access each directory.
Run intensive sequential searches against each directory via the Oblix interface to determine access speed and performance.
Capture data packets during the searches in order to determine the efficiency of the directory searches and responses back to the client.
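A randomly generated data set spread over multiple containers, like the one used for phase two, could be produced along these lines. The article does not describe the generator, the container layout, or the fields, so the container names, the o=XYZ base, and the attributes below are all hypothetical. A fixed seed keeps the file reproducible, so both directories can be loaded with identical data.

```python
import random


def synthetic_entries(count, containers, seed=0):
    """Yield `count` fake employee entries spread across containers."""
    rng = random.Random(seed)
    for n in range(count):
        ou = rng.choice(containers)
        uid = f"e{n:06d}"  # sequential IDs keep every DN unique
        yield {
            "dn": f"uid={uid},ou={ou},o=XYZ",
            "uid": uid,
            "ou": ou,
            "telephoneNumber": f"555-{rng.randrange(10000):04d}",
        }
```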

Phase Two Test Results. In the phase two tests, the team encountered no issues in extending the schema of either directory. However, they did encounter some difficulty importing the new data into the Netscape directory. To prevent Netscape from rejecting records as in phase one, the team used the Netscape directory console. It took over 10 hours to load and index the 200,000 records. NDS performed the same function, with the same file, in just under 3 hours.
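The load times above can be restated as throughput. Treating "over 10 hours" as exactly 10 and "just under 3 hours" as exactly 3 gives an upper bound of about 5.6 records per second for Netscape and a lower bound of about 18.5 for NDS, so NDS loaded the file at least 3.3 times faster. The short calculation below reproduces those figures; the rounding of the durations is the only assumption.

```python
def records_per_second(records, hours):
    """Average load throughput for a bulk import."""
    return records / (hours * 3600)


netscape = records_per_second(200_000, 10)  # upper bound: the load took over 10 hours
nds = records_per_second(200_000, 3)        # lower bound: the load took just under 3 hours

print(f"Netscape: {netscape:.1f} records/s")
print(f"NDS:      {nds:.1f} records/s")
print(f"Ratio:    {nds / netscape:.1f}x")
```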

The baseline search results were approximately the same in phase two, again favoring NDS. No issues were encountered integrating the Oblix application with either directory. Searches using the Oblix interface produced comparable results, favoring NDS. Packet capture results were the same in phase two.

Note: These findings are further supported by Key Labs, Inc., an independent testing laboratory, in their June 1999 report entitled "Scalability and LDAP Performance Comparison." This report details the results of testing the performance of Novell Directory Services 8 (eDirectory) against Netscape Directory Server 4, with similar results. Visit http://www.keylabs.com/ for more information.

Summary of Testing Issues

In phase one, the team encountered one NDS-related issue with regard to the Oblix application's integration with NDS. It seems that in order for Oblix to install properly in a NetWare environment, the LDAP server cannot be a Single Reference Time Server. No NDS-related issues were encountered in phase two of the testing.

In phase one, the team encountered four Netscape-related issues.

Over 26,600 records were rejected during the import. There was no apparent reason why this should have occurred, and no log file entries to help troubleshoot the issue.
Netscape searches exhibited poor performance. As more client workstations were added, the response time for the Netscape directory server degraded dramatically. Random search results were often minutes behind the same search with NDS. Using the Oblix interface did not improve search results. In a single client scenario, this actually catered to the one search where Netscape had outperformed NDS (fewer than two clients and sequential searches).
Processor utilization was extremely high during the searches. The Netscape server showed a utilization of 100% when any search was initiated, regardless of the number of clients making requests. As the load increased (more client workstations were added to make search requests), the server utilization stayed at 100% the entire time it took to finish performing all of the search requests. This might explain the poor performance experienced during the searches. Subsequent optimization of the Netscape server did not improve the situation.
An extra packet was sent from Oblix to the Netscape directory prior to every new search. While this may be the least problematic of the listed issues, any unneeded communication packets should be eliminated.

In phase two, the team encountered two Netscape-related issues:

Loading the data through the Netscape directory console required over 10 hours to complete. No record rejections occurred; however, this amount of time seems excessive.
Processor utilization was extremely high during the searches. This is especially significant in light of the fact that a maximum of four client workstations were used during the search request testing--a very small number when compared to what the "real world" usage would actually be.

Back-End Testing Conclusions

Based on these tests, XYZ Company concluded that Netscape did not meet the required objectives for the project. Netscape directory performance did not match NDS in returning client requests, running anywhere from 0.02 seconds to 40 minutes behind NDS, depending on the search criteria. The Netscape directory was overworked during the searches, running at 100% utilization for the duration of the tests and often appearing to be frozen. During the phase one import, Netscape failed to import over 26,600 of the 207,004 records. Using the Netscape directory console to import the records in phase two, the load time for 200,000 records was over 10 hours.

On the other hand, NDS eDirectory met or exceeded all of the required objectives. NDS returned search requests back to the client more rapidly. Server utilization did not exceed 65% during any of the search requests, indicating that an increased load could probably be placed on the directory with no degradation in performance.

Conclusion

Based on the information gathered from the discovery process, Oblix suited XYZ Company's needs for the front-end, handling all of the functional requirements that were deemed "must have." As the back-end, NDS eDirectory outperformed the Netscape directory and easily handled the required data fields being added to the schema.

XYZ Company recognized that they already had an extensive NDS tree structure, which they would be able to take advantage of to deploy the new application. This would allow them to minimize hardware costs by retaining the existing NetWare servers and needing to purchase only several instances of the Oblix/Web server equipment, which are housed on the same machine.

* Originally published in Novell AppNotes
