TCP/IP performance tuning

I was getting about 70 MBps of network I/O throughput from my TCP/IP-based RPC program on a 1 Gb network. When I ran the same program on a 10 Gb network, I expected at least a 7-8X performance gain. To my surprise, the gain was only about 10%.

I was using the settings below on both my TCP client and server sockets.

// Set send buffer size
setsockopt(sockfd, SOL_SOCKET, SO_SNDBUF, &sendbuff, sizeof(sendbuff));
// Set receive buffer size
setsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, &recvbuff, sizeof(recvbuff));
// Set no delay option (disables Nagle's algorithm)
int flag = 1;
setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
// Set keepalive socket (reuses the flag declared above)
setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &flag, sizeof(flag));

I started experimenting by turning individual socket settings on and off. I got the 7-8X performance gain once I stopped setting the socket send and receive buffer sizes.
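One plausible explanation, which the measurements above do not confirm, is that a fixed buffer limits how much data can be in flight per round trip, so throughput is capped at roughly buffer size divided by round-trip time. The sketch below works through that arithmetic. The function names are hypothetical and the inputs are illustrative, not values from my setup.

```c
#include <assert.h>
#include <stdint.h>

/* Bandwidth-delay product: bytes that must be in flight to keep the link
   full. link_bps is in bits per second, rtt_us in microseconds. */
uint64_t bdp_bytes(uint64_t link_bps, uint64_t rtt_us)
{
    return (link_bps / 8) * rtt_us / 1000000;
}

/* Upper bound on throughput (bytes per second) when at most buf_bytes
   can be in flight during one round trip. */
uint64_t window_limited_bytes_per_sec(uint64_t buf_bytes, uint64_t rtt_us)
{
    assert(rtt_us > 0);
    return buf_bytes * 1000000 / rtt_us;
}
```

With an assumed round trip of 200 microseconds, bdp_bytes(10000000000ULL, 200) gives 250000 bytes, while window_limited_bytes_per_sec(65536, 200) gives 327680000 bytes per second. In that made-up case a 64 KiB buffer would be too small to fill a 10 Gb link.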

While looking into this further, I came across the concept of TCP autotuning. More about this can be found here:

Important notes from the above page:

TCP Autotuning automatically adjusts socket buffer sizes as needed to optimally balance TCP performance and memory usage. Autotuning is based on an experimental implementation for NetBSD by Jeff Semke, and further developed by Wu Feng’s DRS and the Web100 Project. Autotuning is now enabled by default in current Linux releases (after 2.6.6 and 2.4.16). It has also been announced for Windows Vista and Longhorn. In the future, we hope to see all TCP implementations support autotuning with appropriate defaults for other options, making this website largely obsolete.

NB: Manually adjusting socket buffer sizes with setsockopt() disables autotuning. Applications that are optimized for other operating systems may implicitly defeat Linux autotuning.

Do not use setsockopt() to set the send / receive buffer sizes unless you have found buffer sizes for your application that outperform TCP autotuning. In the general case it is better to rely on TCP autotuning.
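As a minimal sketch of that advice, the helper below sets only TCP_NODELAY and SO_KEEPALIVE and leaves SO_SNDBUF and SO_RCVBUF alone, so the kernel stays free to size the buffers itself. The function names open_autotuned_socket and socket_buffer_size are made up for this example. The second helper reads back whatever size the kernel currently reports.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket with TCP_NODELAY and SO_KEEPALIVE enabled, without
   touching SO_SNDBUF or SO_RCVBUF. Returns the descriptor, or -1 on error. */
int open_autotuned_socket(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int on = 1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) < 0 ||
        setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Read back a buffer size as reported by the kernel.
   optname must be SO_SNDBUF or SO_RCVBUF. Returns -1 on error. */
int socket_buffer_size(int fd, int optname)
{
    int size = 0;
    socklen_t len = sizeof(size);

    assert(optname == SO_SNDBUF || optname == SO_RCVBUF);
    if (getsockopt(fd, SOL_SOCKET, optname, &size, &len) < 0)
        return -1;
    return size;
}
```

Calling socket_buffer_size(fd, SO_RCVBUF) before and after a transfer is one way to check what the kernel is doing with the buffer on a given system.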


Berkeley DB performance and disk configuration

In this post I want to highlight the impact of the underlying disk configuration on the performance of Berkeley DB (BDB).
The tests were run as follows:
1. The BDB uses the B-tree access method (DB_BTREE). It stores only keys, with no data. The key length is 24 bytes.
2. A random key is generated and looked up in the db; if it does not exist, it is added to the db.
3. The same set of keys, in the same sequence, was generated for each of the tests below.
4. First, performance was measured with the db stored on a single hard disk.
5. Then the same performance test was run with the db stored on a single SSD.
6. Last, the performance test was run with the db stored on a striped volume consisting of 3 disks.
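Replaying the same keys in the same order for every disk configuration only needs a pseudo-random generator with a fixed seed. The sketch below shows one way to produce 24-byte keys that way. The xorshift64 generator and the names next_random and make_key are illustrative choices, not the code used in the tests, and the BDB lookup and insert are not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KEY_LEN 24  /* key length used in the tests */

/* One xorshift64 step. The state must be seeded with a non-zero value. */
uint64_t next_random(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/* Fill key with KEY_LEN bytes taken from three consecutive PRNG outputs. */
void make_key(uint64_t *state, unsigned char key[KEY_LEN])
{
    for (int i = 0; i < KEY_LEN; i += 8) {
        uint64_t r = next_random(state);
        memcpy(key + i, &r, sizeof(r));
    }
}
```

Starting every run from the same seed yields the same key sequence, so differences between runs come from the storage and not from the workload.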

The graph below plots the results:

Figure: Disk configuration impact on BDB performance

The x-axis of the graph shows size in TB; 1 unit on the x-axis corresponds to roughly 8 million unique keys in the BDB.

These tests were done to decide on a storage appliance configuration. To keep the cost of the appliance low, we decided to use a striped volume instead of an SSD. We could get even better performance if the DB were stored on a striped volume built on top of SSDs.