Troubleshooting High Utilization from an eDirectory perspective

  • 7001413
  • 23-Sep-2008
  • 16-Oct-2013

Environment

Novell NetWare 6.5
Open Enterprise Server
Novell eDirectory 8.8 for All Platforms
Novell eDirectory 8.7.3 for All Platforms

Situation

eDirectory High Utilization

Excessive bindery requests

High valued attributes

Resolution

Troubleshooting eDirectory High Utilization
  • General Troubleshooting
  • Bindery Requests
  • Multi-valued Attributes
  • SVC TYPEID (under legacy information)

General Troubleshooting Steps

These steps can be used to troubleshoot where the utilization is coming from. 

  1. Drop outbound sync threads to 2 (iMonitor --> Agent Configuration --> Agent Synchronization). This is useful in high-load situations. See "Recommendations for tuning eDirectory".

  2. Preallocate the static cache configured.  (KB 10097143)

  3. Set a hard limit on the database cache (iMonitor | Agent Configuration | Database Cache). See the eDirectory online documentation (www.novell.com/documentation). Do not set the memory over 1 GB on a 32-bit eDirectory binary.

  4. Check iMonitor Agent Activity --> Background Process Schedule to confirm if any background process is chewing up CPU cycles.

  5. Confirm that the indexes you have created are all online and are being used for the queries in question. This is different from seeing the index online in traditional index management tools. Essentially, you need to turn on RECM (Recman) traces and confirm that there are no messages like "Index Auto is offline. Indexing through record Auto2", and that the LDAP queries result in the desired indexes (or any index at all) being used.

  6. Check that Advanced Referral Costing is enabled (iMonitor Advanced Mode --> Agent Configuration --> Permanent Settings --> Change --> Advanced Referral Costing = 1). This only affects server-to-server communication. If an LDAP server doesn't hold replicas, it will do costing to see which server is nearest. Advanced Referral Costing helps confirm that the server it is communicating with is the fastest server with real copies of the information it is looking for. See the eDirectory 8.8 online documentation for more information.

  7. If NMAS is being used, make sure you are on the latest supported Security Services Patch level.

  8. Go to iMonitor | Agent Configuration | Agent Synchronization and disable outbound synchronization. Does utilization drop on this server or on other servers? Then disable inbound synchronization: does this server's utilization drop? Finally, disable both inbound and outbound synchronization: does this make a difference? If so, troubleshoot replication through dstrace. Turn on the outbound synchronization, outbound sync detail, inbound synchronization and inbound sync detail options in dstrace. If objects are synchronizing over and over, you may need to go through the ANT process (dsrepair -ANT, see KB 7000563).

  9. Unplug the network cable from the server with high utilization. Does utilization drop? If so, and if disabling synchronization in step 8 did not affect utilization, then the high utilization is coming from the wire.
    • Turn on dstrace with the dsa (Directory Services Agent) tag. You may also want to turn on agent buffers in the trace. The trace isn't easy to read, but it should indicate which attributes are being looked at and which connection is associated with the query.
    • On a NetWare box, go into NORM, https://<server-ip-addr>:8009 | Diagnose Server | Profile / Debug | you now see the busiest thread(s). Watch this to see if it is consistently the same thread. Click on the thread to see more information. It may indicate what could be causing high utilization.
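The isolation tests in steps 8 and 9 amount to a small decision procedure. The sketch below is only an illustration of that reasoning: the function name and its two boolean inputs are hypothetical, and the final fallback (pointing back at steps 4 and the busiest-thread view) is an assumption about what to check when neither test changes utilization.

```python
def locate_load(dropped_with_sync_disabled, dropped_with_cable_unplugged):
    """Suggest where to look next, based on the isolation tests in steps 8 and 9.

    dropped_with_sync_disabled: utilization fell when inbound and/or
        outbound synchronization was disabled in iMonitor (step 8).
    dropped_with_cable_unplugged: utilization fell when the network
        cable was unplugged (step 9).
    """
    if dropped_with_sync_disabled:
        # Step 8: replication is the likely source.
        return "replication: trace inbound/outbound sync with dstrace"
    if dropped_with_cable_unplugged:
        # Step 9: requests arriving from the wire are the likely source.
        return "network: trace with the dstrace dsa tag"
    # Assumption: nothing external changed the load, so look locally.
    return "local: check background processes and the busiest threads"


print(locate_load(False, True))
```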



Troubleshooting Bindery Requests

  • This is usually seen coming from the network.
  • This can be verified with DSTRACE.

Setup for trace:
  1. SET DSTRACE = +EMU (Emulate Bindery) to track down Bindery calls to NDS which are known to cause high utilization.
  2. SET DSTRACE = ON
  3. SET DSTRACE = NODEBUG
  4. SET TTF = ON
  5. SET DSTRACE = *R
  6. SET DSTRACE = +EMU
  7. Allow this to run for several minutes during the high utilization time.  Then turn off. SET TTF = OFF
  8. Reading the log
  • DSTRACE.DBG with +EMU filter
  • Find Next Object, btmatch=3 (mask=8), lastobj==ffffffff   )  for conn 7
  • Done with Find Next Object, returning== 8466, type=3 )  Error=succeeded
  • Find Next Object, btmatch=3 (mask=8), lastobj==8466  PRINTQUE.NOVELL.TREE  )  for conn 7
  • Done with Find Next Object, returning== 0, type=0 )   Error=failed, no such object (-252)
  • Examine DSTRACE.DBG for repeated requests. In the log above, connection 7 (conn 7) was requesting a bindery type of 0003 (btmatch=3), which is a bindery print queue. We also know that the bindery print queue is PRINTQUE.NOVELL.TREE and has an EID of 8466 (returning== 8466). If the log shows a large number of requests for btmatch=3, the print queue is using a large share of the bindery service time. Another type that has been a problem in the past is 1200 (Arcserv queue).
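Counting requests by hand is tedious on a busy server. The following is a minimal sketch that tallies bindery requests per connection and bindery type. It assumes every request line in DSTRACE.DBG follows the layout of the sample lines above ("btmatch=... (mask=...) ... for conn N"); the function name and the regular expression are illustrative and not part of any Novell tool.

```python
import re
from collections import Counter

# Matches request lines shaped like the sample trace above, e.g.
#   Find Next Object, btmatch=3 (mask=8), lastobj==ffffffff   )  for conn 7
REQUEST_RE = re.compile(r"btmatch=(\w+)\s*\(mask=(\w+)\).*?for conn\s+(\d+)")


def tally_bindery_requests(lines):
    """Count bindery requests per (connection number, btmatch) pair."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            btmatch, _mask, conn = match.groups()
            counts[(int(conn), btmatch)] += 1
    return counts


# The four lines from the sample trace above.
SAMPLE = [
    "Find Next Object, btmatch=3 (mask=8), lastobj==ffffffff   )  for conn 7",
    "Done with Find Next Object, returning== 8466, type=3 )  Error=succeeded",
    "Find Next Object, btmatch=3 (mask=8), lastobj==8466  PRINTQUE.NOVELL.TREE  )  for conn 7",
    "Done with Find Next Object, returning== 0, type=0 )   Error=failed, no such object (-252)",
]

# Busiest (connection, type) pairs first.
for (conn, btmatch), n in tally_bindery_requests(SAMPLE).most_common():
    print(f"conn {conn} btmatch={btmatch}: {n} requests")
```

To run it against a real trace, pass an open file object for DSTRACE.DBG instead of SAMPLE. The pairs at the top of the output are the connections and bindery types to investigate first.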

 

Troubleshooting Multi-valued attributes
  • This is usually seen coming from the network, or locally.
  • This can be verified with DSBROWSE or DSREPAIR.

Finding the high valued attribute:

On NetWare:

  1. DSBROWSE -CV[num], where [num] is the minimum number of multi-values to display. E.g. DSBROWSE -CV1000.
  2. Use the search request form and do a normal search. (To find everything with high values, enter * as the name of the object and hit F10.)
  3. The log files are written to the SYS:\SYSTEM directory and are named valcnt[num].log.
On All Platforms: Use iMonitor Reports. See TID 10090648.


Reading the report: the following attributes are the ones to look for when they contain a high number of values, along with the steps to fix each one.

  1. Reference attributes will commonly show up with a high number of values.  These should not be considered a problem.

  2. NLS:Cert Peak Used Pool is a problem if it shows up in the report with a high number of values. To fix this, remove the licenses and reinstall them into the tree. To prevent the Novell Licensing Service from updating Directory Services with licensing data, use NLSLSP.NLM dated NOVEMBER 26, 2001 or later and use the following set command: "SET NLS REPORT DATA = OFF".

  3. WM:Registered Workstation attribute is a problem if it shows up in the report with a high number of values.  To fix this you can use DSREPAIR -WM, and run a local dsrepair.  This will delete all the values in the attribute. But this action will NOT synchronize the deleted values to other replicas. You should then use the command on the other servers also holding replicas of the partitions where the command was used.  This is a temporary workaround because the values will be re-added next time the user logs in until all workstations are imported.

  4. Third Party attributes that show up in the report could also be an issue.  We have found some in the past and there may be more.  You may have to delete the attribute and see if the utilization comes down.  This may however cause a problem with the Third Party App by deleting some of its attributes.  Always check in DUMP or DSBROWSE for the attributes if they are suspected.

Additional Information

Note: Unloading the eDirectory service is not a good way to see if eDirectory is causing high utilization. The cause could be from a service running that is doing a mass number of client search requests against the directory.


Legacy information

Please refer to KB 10011512 for more information regarding NetWare 5.x and 4.x servers, and high utilization.

Checking High Value counts on NDS 7.x which shipped with NetWare 5.x.
  1. DSREPAIR  5.23e or newer for NDS 7.x
  2. DSREPAIR -CV[num], where [num] is the minimum number of multi-values to display. E.g. DSREPAIR -CV1000.
  3. The report will append to the DSREPAIR.LOG in the SYS:\SYSTEM directory.
  4. When you run the report you should check for values above 500.

Legacy information: If the server has high values for the NLS:Cert Peak Used Pool attribute, you can use DSREPAIR 85.13 to run DSREPAIR -NLC and clear this attribute. This repair needs to be run on every server simultaneously to prevent synchronization of the attributes. Note that the DSREPAIR -NLC switch is not present in eDirectory 8.6.2 and later.


Troubleshooting SVC TYPEID 
(issue Novell Support has not seen in a long time)

  1. This is usually seen coming from the network
  2. This can be verified by a DSTRACE.
  3. Finding SVC TYPEID requests


If this is suspected to be causing high utilization then you can verify this using DSTRACE.  Using the following DSTRACE commands you can verify this issue.

SET DSTRACE = ON
SET TTF = ON
SET DSTRACE = +DSA
SET DSTRACE = +BUFFERS
Wait 5-10 minutes
SET DSTRACE = OFF
SET TTF = OFF

Next, search the DSTRACE.DBG file for any -603 errors. Generally the -603 error will also report the SVC TYPEID. If you are seeing this problem, the SVC TYPEID will appear many times in the log with the same connection number. If the Novell Client is older than 4.9 SP4, patch the client. This issue has been seen primarily with the 4.8 client.
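As a rough aid for that search, the sketch below pulls out every trace line that mentions error -603. The article does not show the layout of these lines, so the code assumes only that the literal token "-603" appears in them; the function name and the sample lines are hypothetical.

```python
import re

# "-603" as a standalone number: not part of a longer number such as -6030.
ERROR_603_RE = re.compile(r"(?<!\d)-603(?!\d)")


def find_603_lines(lines):
    """Return the trace lines that mention error -603, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if ERROR_603_RE.search(line)]


# Hypothetical lines for illustration only; real DSTRACE.DBG output will differ.
hypothetical_trace = [
    "request A failed, error -603\n",
    "request B succeeded\n",
    "unrelated value -6030\n",
]

for hit in find_603_lines(hypothetical_trace):
    print(hit)
```

Reading the matching lines together makes it easier to spot whether the same connection number keeps recurring.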


NLS Metering

How to disable NLS metering (known to cause high utilization):
Rename the three NLMs below, then unload them.
  1. unload connaud.nlm
  2. unload nlslrup.nlm
  3. unload nlsmeter.nlm

Quality of Service (QoS):

The  Quality of Service parameters are for bindery requests.
set NDS Bindery QOS Mask  (Must be set for the bindery type causing high utilization)
set NDS Bindery QOS Delay (Specifies the delay in ms for the selected types in the mask)

Most issues are caused by bindery print queues, so set the following at the server console:
set NDS Bindery QOS Mask = 8       (THIS IS THE DEFAULT)                       
set NDS Bindery QOS Delay = 150   (THIS IS THE DEFAULT)

The default for NDS Bindery QOS Delay in NDS v8.59 is 150. The default for NDS Bindery QOS Mask will be set to 8.

To throttle ALL Bindery types use the following values, as a test to determine if this is the problem, at the server console:                       
set NDS Bindery QOS Mask = -1                       
set NDS Bindery QOS Delay = 1500

To view the current values, at the server console:   
set NDS Bindery QOS Mask                       
set NDS Bindery QOS Delay



Formerly known as TID# 10068952