High Throughput Best Practices

This section summarizes our findings and recommends best practices to tune the different layers of the OnixS .NET Framework FIX Engine for high-throughput workloads. Please note that the exact benefits and effects of each of these configuration choices will be highly dependent upon the specific applications and workloads, so we strongly recommend experimenting with the different configuration options with your workload before deploying them in a production environment.

Session tuning

There are many settings in the Session class, and this section covers only the subset that relates to high-throughput workloads.

Selecting the right storage

Please see the corresponding section in Low Latency Best Practices.

Reusing FIX message instance for incoming messages

Please see the corresponding section in Low Latency Best Practices.

Message grouping

Network efficiency and latency usually pull in opposite directions. Smaller packets are transmitted over the network faster and therefore have lower latency. However, many small packets carry more network overhead (IP headers and Ethernet headers) than fewer large packets. The OnixS .NET Framework FIX Engine lets you strike the right balance by dividing outgoing messages into groups, which improves efficiency both on the network and on the receiver side. Each session can be configured using the MessageGrouping property. The table below describes the possible values of the MessageGrouping option.

Value	Description
0 (default)	Messages are sent as soon as possible, and pending messages (if any) are grouped until the TCP buffer size is reached.
1	Messages are sent as soon as possible and are never grouped.
N (2 or greater)	Messages are sent as soon as possible, and pending messages are grouped in batches of at most N messages.
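
As a minimal sketch of how the property might be set, the fragment below assumes that `session` is an already-created Session instance. The value 8 is purely illustrative and is not a recommendation; the right group size depends on your message sizes and rates, so measure it against your own workload.

```csharp
// Assumption: 'session' is an existing Session instance created elsewhere.
// Allow pending messages to be grouped in batches of at most 8.
// The value 8 is an illustrative placeholder, not a tuned recommendation.
session.MessageGrouping = 8;
```

Setting the property back to 0 restores the default behavior described in the table above.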

Updating Engine configuration

Disabling Resend Requests functionality

Please see the corresponding section in Low Latency Best Practices.

Disabling FIX messages validation

Please see the corresponding section in Low Latency Best Practices.

Optimizing FIX dictionaries

Please see the corresponding section in Low Latency Best Practices.

Transport Layer and the TCP/IP Stack

There are many options in the protocol stack that can affect the efficiency of data delivery. You should understand the characteristics of the stack version you are running and make sure it is compatible with the versions and options used by the peer stacks.

Enable Nagle's algorithm (disable TCP_NODELAY)

Nagle's algorithm reduces network overhead by coalescing small outgoing writes into fewer, larger packets. We strongly recommend enabling Nagle's algorithm for high-throughput workloads. Please see TcpNoDelayOption and the code snippet below for more details.

// Keep Nagle's algorithm enabled by leaving TCP_NODELAY off.
EngineSettings settings = new EngineSettings();
settings.TcpNoDelayOption = false;

Adjust TCP windows for the Bandwidth Delay Product

TCP depends on several factors for performance. Two of the most important are the link bandwidth (the rate at which packets can be transmitted on the network) and the round-trip time, or RTT (the delay between a segment being sent and its acknowledgment from the peer). These two values determine what is called the Bandwidth Delay Product (BDP).

Given the link bandwidth rate and the RTT, you can calculate the BDP, but what does this do for you? It turns out that the BDP gives you an easy way to calculate the theoretical optimal TCP socket buffer sizes (which hold both the queued data awaiting transmission and queued data awaiting receipt by the application). If the buffer is too small, the TCP window cannot fully open, and this limits performance. If it's too large, precious memory resources can be wasted. If you set the buffer just right, you can fully utilize the available bandwidth. Let's look at an example:

BDP = link_bandwidth * RTT

If your application communicates over a 100Mbps local area network with a 50 ms RTT, the BDP is:

100Mbps * 0.050 sec / 8 = 0.625MB = 625KB

So, set your TCP window to the BDP, or 625KB. But the default window for TCP in .NET is 8KB, which limits your bandwidth for the connection to 160KBps, as we have calculated here:

throughput = window_size / RTT

8KB / 0.050 = 160KBps

If instead, you use the window size calculated above, you get a whopping 12.5MBps, as shown here:

625KB / 0.050 = 12.5MBps

That's quite a difference and will provide greater throughput for your socket. So, now you know how to calculate the optimal socket buffer size for your socket. But how do you make this change?

The Sockets API provides several socket options, two of which exist to change the socket send and receive buffer sizes. The code snippet below shows how to adjust the size of the socket send and receive buffers with the SendBufferSize (SO_SNDBUF) and ReceiveBufferSize (SO_RCVBUF) options.

// Size both socket buffers to the BDP calculated above (625 KB).
EngineSettings settings = new EngineSettings();
settings.SendBufferSize = 625 * 1024;
settings.ReceiveBufferSize = 625 * 1024;

Although the socket buffer size determines the size of the advertised TCP window, TCP also maintains a congestion window within the advertised window. Therefore, because of congestion, a given socket may never utilize the maximum advertised window.
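
The arithmetic above can be sketched as a small helper, so that the buffer size is derived from measured values rather than hard-coded. This is only an illustration: the class and method names (`BandwidthDelayProduct`, `Bytes`, `MaxThroughput`) are hypothetical and are not part of the OnixS API. The inputs are the 100 Mbps link and 50 ms RTT from the example.

```csharp
using System;

// Hypothetical helper, not part of the OnixS API.
static class BandwidthDelayProduct
{
    // BDP in bytes: link bandwidth (bits per second) * RTT (seconds) / 8 bits per byte.
    public static long Bytes(double linkBitsPerSecond, double rttSeconds)
    {
        return (long)Math.Round(linkBitsPerSecond * rttSeconds / 8.0);
    }

    // Upper bound on throughput (bytes per second) that a given window allows.
    public static double MaxThroughput(long windowBytes, double rttSeconds)
    {
        return windowBytes / rttSeconds;
    }
}

static class Program
{
    static void Main()
    {
        const double linkBitsPerSecond = 100e6; // 100 Mbps, as in the example above
        const double rttSeconds = 0.050;        // 50 ms round-trip time

        long bdpBytes = BandwidthDelayProduct.Bytes(linkBitsPerSecond, rttSeconds);
        Console.WriteLine("BDP: {0} bytes", bdpBytes);
        Console.WriteLine("Throughput with an 8 KB window: {0:F0} bytes/s",
            BandwidthDelayProduct.MaxThroughput(8000, rttSeconds));
        Console.WriteLine("Throughput with a BDP-sized window: {0:F0} bytes/s",
            BandwidthDelayProduct.MaxThroughput(bdpBytes, rttSeconds));
    }
}
```

The helper uses decimal units (1 KB = 1000 bytes) to match the arithmetic in the text, whereas the settings snippet above uses 625 * 1024, which is slightly larger. Either value is a reasonable starting point for SendBufferSize and ReceiveBufferSize, to be refined by measurement.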

Note
TCP for Windows 2000 and higher also supports window scaling, as detailed in RFC 1323, TCP Extensions for High Performance. Scaling enables TCP to provide a receive window of up to 1 GB.

Note
A larger buffer size might delay the recognition of connection difficulties.

Other Resources

Low Latency Best Practices