Tech Info 110: TCP/IP performance problem between Mac OS X 10.4 and Solaris 10

HELIOS Tech Info #110

Fri, 28 Sep 2007

Multiple customers have reported a major performance problem that occurs only in the following environment:
Server: Sun Solaris 10 on SPARC or x86 (other servers are fine, e.g. Solaris 8, AIX, Mac OS X)
Client: Mac OS X 10.4 (other clients are fine, e.g. Mac OS 9 or Mac OS X 10.3)
Routed network only:
Layer 3 switches or routers between the server and the clients (no problems with directly connected clients, e.g. via a non-routing switch)

The problem: 

The problem occurs when the Mac client reads data from the server in larger chunks, e.g. 64 kB blocks: from time to time the server stops sending data. In this case the TCP/IP protocol trace shows that the server starts sending a few packets (3-6) of the read response and then suddenly waits about 0.2 seconds until the Mac sends a TCP ACK for the received data. Only then does the server continue with more data.
Interestingly, the Sun TCP stack does not set the PushAck flag on the packets preceding the point where it stops sending; the PushAck flag is set only on the last data packet of the 64 kB response sent to the client.
This repeated waiting for ACKs from the Mac client results in very slow network performance, which occurs only in the routed client-server environment listed above. In all test cases the available TCP/IP window size on both the server and the client was more than 128 kB. We believe that something is wrong with the Solaris 10 TCP/IP implementation, because it waits for ACKs before sending more data although the window would allow it to continue.
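
To get a feeling for the impact, the stall can be turned into a rough throughput ceiling. The following is only a back-of-the-envelope sketch: it assumes the worst case of one 0.2 second stall on every 64 kB block (the trace only shows stalls "from time to time") and ignores the actual transfer time. The 30 MB figure is the size of the LanTest read test; the variable names are chosen for illustration.

```shell
#!/bin/sh
# Assumed worst case: every 64 kB read block stalls once for 0.2 s.
block_bytes=65536
stall_ms=200
file_bytes=$((30 * 1024 * 1024))   # 30 MB read test

# Throughput ceiling if each block has to wait once for the delayed ACK.
max_bytes_per_sec=$((block_bytes * 1000 / stall_ms))

# Number of blocks in the file and total time spent waiting.
blocks=$((file_bytes / block_bytes))
stall_seconds=$((blocks * stall_ms / 1000))

echo "throughput ceiling: $((max_bytes_per_sec / 1024)) kB/s"
echo "stall time for 30 MB: ${stall_seconds} s in ${blocks} blocks"
```

Under these assumptions the ceiling is 320 kB/s, and a 30 MB read spends 96 seconds just waiting for ACKs. Real figures depend on how often the stall actually occurs.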

Three available workarounds:

  1. Connect the Mac OS X 10.4 clients directly via a standard switch to a server network interface, without routing.
  2. Set the EtherShare AFP block size to 8 or 16 kB instead of the default 128 kB, via:
    prefvalue -k "Programs/afpsrv/dsiblocksize" -t int 16384
  3. Change the delayed ACK setting on the Mac OS X 10.4 client from 3 to 2 (note that this change is not permanent; it is lost after rebooting):
    sysctl -w net.inet.tcp.delayed_ack=2

    To make this change permanent, add the following line to “/etc/sysctl.conf”:
    net.inet.tcp.delayed_ack=2
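
Workaround 3 can be scripted so that the entry in "/etc/sysctl.conf" is not duplicated when the script is run more than once. The following is a minimal sketch, not a HELIOS tool: the function name persist_sysctl and its optional file argument are hypothetical, and it assumes a plain "key=value" file format.

```shell
#!/bin/sh
# Hypothetical helper: make sure "KEY=VALUE" appears exactly once in a
# sysctl-style config file. Usage: persist_sysctl KEY VALUE [FILE]
persist_sysctl() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    tmp="$file.tmp.$$"

    # Escape the dots in the key so they match literally in the pattern.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')

    # Create the file if it does not exist yet.
    [ -f "$file" ] || : > "$file" || return 1

    # Copy all lines except existing assignments of this key,
    # then append the desired setting.
    {
        grep -v "^[[:space:]]*$pattern[[:space:]]*=" "$file"
        printf '%s=%s\n' "$key" "$value"
    } > "$tmp" || { rm -f "$tmp"; return 1; }

    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

Run as root on the client, `persist_sysctl net.inet.tcp.delayed_ack 2` updates the file; the setting can then be applied immediately with the sysctl -w command shown in workaround 3.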

The problem can be reproduced with the default HELIOS LanTest 30 MB read test. Uncheck all test items except the read test; the problem will occur after several runs. We have reported the problem to Sun.