Error: "Socket Read Failure"
(Last modified: 06Feb2003)
This document (10010083) is provided subject to the disclaimer at the end of this document.
fact
Novell GroupWise 5.2
Novell GroupWise 5.5
Novell GroupWise Message Transfer Agent
symptom
Error: "Socket Read Failure"
Errors occur sending large attachments
Errors occur when sending through slow WAN links
Message Transfer Agent stops processing messages
Message Transfer Agent can receive mail
Message Transfer Agent cannot send mail
Error: 8913 "Socket Write Failure"
Error occurs on the receiving MTA
cause
The GroupWise Message Transfer Agent (MTA) communicates over TCP, a connection-oriented protocol that depends on acknowledgements. Every TCP packet sent requires a corresponding ACK (acknowledgement). When IP packets are dropped at a router, the acknowledgement never arrives, and the GroupWise MTA resends the packets until a connection is established. These resends cause the IP buffer to fill. The IP stack responds to the full buffer by setting the window size to 0 so that the stack can recover. The MTA cannot tell when the window size is reset, so its TCP threads hang. If this happens often enough, the agent stops processing messages because no threads are available. This scenario always arises from an invalid MTU table on the server that creates the TCP packets: too large a packet size is negotiated, which results in dropped packets and, eventually, the errors above.
REASONS FOR AN INVALID MTU TABLE:
1. Bad LAN drivers. The path MTU algorithm queries all of the LAN drivers to set up the MTU table. If the LAN drivers on the routers, servers, etc. are old or outdated, incorrect values are reported.
2. Static settings on routers, switches, etc. In one example, the customer had an FDDI card in a server that was on an Ethernet network. Packets on an FDDI network can be as large as 4202 bytes; Ethernet packets can be no longer than 1514 bytes. The server had the set parameter "set maximum physical receive packet size = 4200". This setting was sent to the MTU table on the sending server. The resulting 4200-byte packets were dropped because they were too large for the Ethernet network.
3. Bad hardware. Bad NICs in servers, routers, etc., or bad switches or hubs, can also report incorrect data.
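The FDDI/Ethernet example above comes down to a simple rule: the largest packet that survives a path is bounded by the smallest limit of any segment on it. The following Python sketch illustrates that rule only. The size limits are the figures quoted in this document; the names MAX_PACKET, largest_safe_packet and is_dropped are hypothetical and are not part of any Novell tool.

```python
# Illustration only: per-medium limits are the figures quoted in this document.
MAX_PACKET = {"ethernet": 1514, "fddi": 4202}


def largest_safe_packet(path):
    """Return the largest packet size (bytes) that every segment in path can carry."""
    return min(MAX_PACKET[medium] for medium in path)


def is_dropped(packet_size, path):
    """Return True if a packet of packet_size bytes exceeds the path's smallest limit."""
    return packet_size > largest_safe_packet(path)


# A server with an FDDI card advertises 4200 bytes, but the path crosses Ethernet.
path = ["fddi", "ethernet"]
print(largest_safe_packet(path))
print(is_dropped(4200, path))
```

An MTU table is invalid in the sense described here when it advertises a size above what largest_safe_packet would return for the real path.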
fix
Bypass all LAN drivers by setting a maximum physical receive packet size on the sending server. To do so, add the following set parameter to the startup.ncf file:
"set maximum physical receive packet size = [byte size]"
The "byte size" mentioned above has to be guessed. Because it is unknown at this point who or what is dropping the packets, it is impossible to determine what the limit should be. It is recommended to start with 800 bytes, then move to 1000, 1200, etc. until 1500 is reached on Ethernet or 4200 on an FDDI network. This can be used as a workaround until the culprit is identified and the problem fixed. Finding the component that is causing the problem is up to the customer; this is outside the scope of Novell's NTS department.
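The trial sequence described above can be written out ahead of time. The following Python sketch is a convenience only, assuming a fixed step of 200 bytes (the document gives 800, 1000, 1200 and then "etc.", so the step is an assumption). The function name candidate_sizes is hypothetical. Each printed line is the startup.ncf parameter with one trial value filled in.

```python
ETHERNET_LIMIT = 1500  # upper bound given in this document for Ethernet
FDDI_LIMIT = 4200      # upper bound given in this document for FDDI


def candidate_sizes(limit, start=800, step=200):
    """Return trial byte sizes from start upward, always ending exactly at limit.

    The 200-byte step is an assumption; the document only lists 800, 1000, 1200.
    """
    sizes = list(range(start, limit, step))
    sizes.append(limit)
    return sizes


# Print one startup.ncf line per trial value for an Ethernet network.
for size in candidate_sizes(ETHERNET_LIMIT):
    print(f"set maximum physical receive packet size = {size}")
```

Try one value at a time and stay at the largest value for which the errors do not return.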
A second, less recommended procedure can also be used. Novell recommends the procedure above over the following one, mainly because the following one disables the Path MTU algorithm, which could affect performance on the network.
1. Download the latest TCP/IP Stack for NetWare 4.11 (TCPN05.EXE or later)
This latest version of the TCP/IP stack supports the following set parameters:
set use specified mtu = on
set maximum interface mtu = [bytes]
Byte Range = 576 - 4202.
This setting will cause all IP packets, regardless of environment, to be the static size specified. This is inefficient and should only be used as a last resort. Always keep the type of network in mind. Packets should never be greater than 1514 bytes on an Ethernet network.
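Before editing the server configuration, a proposed value can be checked against the limits stated above. The following Python sketch is a minimal check, assuming only the figures in this document (a byte range of 576 to 4202 and a 1514-byte ceiling on Ethernet). The function name static_mtu_lines is hypothetical; the two lines it returns are the set parameters listed above.

```python
MTU_MIN, MTU_MAX = 576, 4202  # byte range quoted in this document
ETHERNET_MAX = 1514           # Ethernet ceiling quoted in this document


def static_mtu_lines(mtu, ethernet=True):
    """Validate mtu and return the two set parameters as text lines.

    Raises ValueError if mtu is outside the supported range, or above the
    Ethernet ceiling when ethernet is True.
    """
    if not MTU_MIN <= mtu <= MTU_MAX:
        raise ValueError(f"{mtu} is outside the range {MTU_MIN} - {MTU_MAX}")
    if ethernet and mtu > ETHERNET_MAX:
        raise ValueError(f"{mtu} exceeds the Ethernet limit of {ETHERNET_MAX}")
    return [
        "set use specified mtu = on",
        f"set maximum interface mtu = {mtu}",
    ]


for line in static_mtu_lines(1500):
    print(line)
```

Calling static_mtu_lines(4200) with the default ethernet=True raises an error, which mirrors the FDDI-on-Ethernet misconfiguration described in the cause section.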
Another consideration is as follows:
Many newer NICs default to automatically negotiating with the router whether to run at Full or Half Duplex. Apparently this does not always return optimum results. In one case the NIC had negotiated Half Duplex when the router was probably set up for Full Duplex.
Reconfigure the NIC from Auto-Negotiate to 100Base-TX/Full Duplex.
document
Document Title: | Error: "Socket Read Failure" |
Document ID: | 10010083 |
Solution ID: | 4.0.20805.1675983 |
Creation Date: | 23Jun1998 |
Modified Date: | 06Feb2003 |
Novell Product Class: | Connectivity Products Groupware NetWare Novell BorderManager Services |
disclaimer
The origin of this information may be internal or external to Novell. Novell makes all reasonable efforts to verify this information. However, the information provided in this document is for your information only. Novell makes no explicit or implied claims to the validity of this information.
Any trademarks referenced in this document are the property of their respective owners. Consult your product manuals for complete trademark information.