High Throughput Best Practices
This section summarizes our findings and recommends best practices for tuning the different layers of the OnixS .NET FIX Engine for high-throughput workloads. Please note that the specific benefits and effects of each configuration choice depend heavily on the particular application and workload, so we strongly recommend experimenting with the different options against your workload before deploying them in a production environment.
Session Tuning
There are many settings in the Session class, and this section covers only the subset that relates to high-throughput workloads.
Selecting the Right Session Storage
Please see the corresponding section in Low Latency Best Practices.
Reusing Message Instance for Incoming Messages
Please see the corresponding section in Low Latency Best Practices.
Message Grouping
Network utilization and latency are usually inversely proportional: smaller packets are transmitted over the network faster and therefore have lower latency, but many small packets carry more network overhead (IP and Ethernet headers) than fewer large ones. The Engine lets you strike the right balance by dividing messages into groups, which improves efficiency both on the network and on the receiver side. Each session can be configured using the MessageGrouping property. The table below describes the possible values of the MessageGrouping option, and a configuration snippet follows the table.
Value | Description |
---|---|
0 | Default. Messages are sent ASAP. Pending messages are grouped until the TCP buffer size is reached. |
1 | Messages are sent ASAP and are never grouped. |
2 or higher | Messages are sent ASAP. Pending messages are grouped up to the specified number. |
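For example, grouping can be tuned per session. The snippet below is a minimal sketch; it assumes an already created Session instance named session, and the group size of 10 is purely illustrative:
C#:
// Group up to 10 pending messages per send.
// 0 (default) groups until the TCP buffer size is reached; 1 disables grouping.
session.MessageGrouping = 10;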
Updating Engine configuration
Disabling Resend Requests functionality
Please see the corresponding section in Low Latency Best Practices.
Disabling FIX Messages Validation
Please see the corresponding section in Low Latency Best Practices.
Optimizing FIX Dictionaries
Please see the corresponding section in Low Latency Best Practices.
Transport Layer and the TCP/IP Stack
There are many options in the protocol stack that can affect the efficiency of data delivery. You must understand the characteristics of the TCP/IP stack version you are running and ensure that they are compatible with the versions and options of the stacks on the other side of the connection.
Enable Nagle's algorithm (disable TCP_NODELAY)
Nagle's algorithm minimizes network overhead by coalescing small packets into larger ones. We strongly recommend enabling Nagle's algorithm for high-throughput workloads. Please see TcpNoDelay and the code snippet below for more details.
C#:
var settings = new EngineSettings();
settings.TcpNoDelay = false; // false keeps Nagle's algorithm enabled (TCP_NODELAY off)
Engine.Init(settings);
Adjust TCP windows for the Bandwidth Delay Product
TCP depends on several factors for performance. Two of the most important are the link bandwidth (the rate at which packets can be transmitted on the network) and the round-trip time, or RTT (the delay between a segment being sent and its acknowledgment from the peer). These two values determine what is called the Bandwidth Delay Product (BDP).
Given the link bandwidth rate and the RTT, you can calculate the BDP, but what does this do for you? It turns out that the BDP gives you an easy way to calculate the theoretical optimal TCP socket buffer sizes (which hold both the queued data awaiting transmission and queued data awaiting receipt by the application). If the buffer is too small, the TCP window cannot be fully open, and this limits performance. If it's too large, precious memory resources can be wasted. If you set the buffer just right, you can fully utilize the available bandwidth. Let's look at an example:
BDP = link_bandwidth * RTT
If your application communicates over a 100 Mbps local area network with a 50 ms RTT, the BDP (dividing by 8 to convert bits to bytes) is:
100 Mbps * 0.050 sec / 8 = 0.625 MB = 625 KB
So, set your TCP window to the BDP, or 625 KB. But the default window for TCP in .NET is 8 KB, which limits your bandwidth for the connection to 160 KBps, as we have calculated here:
throughput = window_size / RTT
8 KB / 0.050 sec = 160 KBps
If instead you use the window size calculated above, you get a whopping 12.5 MBps, as shown here:
625 KB / 0.050 sec = 12.5 MBps
That's quite a difference and will provide much higher throughput for your socket. So, now you know how to calculate the optimal socket buffer size. But how do you make this change?
The Sockets API provides several socket options, two of which exist to change the socket send and receive buffer sizes. The code snippet below shows how to adjust the size of the socket send and receive buffers with the SendBufferSize (SO_SNDBUF) and ReceiveBufferSize (SO_RCVBUF) options.
For example:
var settings = new EngineSettings
{
    SendBufferSize = 625 * 1024,
    ReceiveBufferSize = 625 * 1024
};
Engine.Init(settings);
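The buffer sizes can also be derived programmatically from the measured link characteristics. The sketch below is illustrative only; the OptimalBufferSize helper is not part of the Engine API, and the bandwidth and RTT inputs are the example figures from this section:
C#:
// Hypothetical helper: BDP-based buffer size in bytes, computed from the
// link bandwidth (bits per second) and the RTT (seconds).
static int OptimalBufferSize(double bandwidthBitsPerSec, double rttSec)
{
    return (int)(bandwidthBitsPerSec * rttSec / 8); // divide by 8 to convert bits to bytes
}

// Example values from this section: 100 Mbps link, 50 ms RTT => 625,000 bytes.
int bufferSize = OptimalBufferSize(100_000_000, 0.050);

var settings = new EngineSettings
{
    SendBufferSize = bufferSize,
    ReceiveBufferSize = bufferSize
};
Engine.Init(settings);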
Although the socket buffer size determines the size of the advertised TCP window, TCP also maintains a congestion window within the advertised window. Therefore, because of congestion, a given socket may never utilize the maximum advertised window.
Note
TCP for Windows 2000 and higher also supports window scaling, as detailed in RFC 1323, TCP Extensions for High Performance. Window scaling enables TCP to provide a receive window of up to 1 GB.
A larger buffer size might delay the recognition of connection difficulties.