Low Latency Best Practices

Hardware and Middleware

One of the most effective ways to reduce latency is to use specialized network cards (e.g., Solarflare) together with a user-space TCP stack implementation (e.g., Onload).

Selecting Session Storage

Using MemoryBasedStorage instead of FileBasedStorage boosts performance since SBE messages are stored directly in memory.

Alternatively, it's possible to use pluggable storage (PluggableStorage) that does nothing on message-related operations.

You can also use the Asynchronous File-Based Session Storage if you need to keep file-based storage functionality without giving up much performance.

Threads Tuning

Affinity

By default, session threads can be executed on any of the available processors/cores. Specifying CPU affinity for each thread may give a significant performance boost:

// Pin the receiving thread to CPU core 1 and the sending thread to CPU core 2.
int[] receivingCpuIndexes = new int[] { 1 };
int[] sendingCpuIndexes = new int[] { 2 };

session.ReceivingThreadAffinity = receivingCpuIndexes;

// The sending thread sends messages that could not be sent immediately
// and were therefore saved to the outgoing queue.
session.SendingThreadAffinity = sendingCpuIndexes;

Note

Ideally, each spinning thread should run on a separate CPU core so that it will not stop other important threads from doing work if it blocks or is de-scheduled. If more than one spinning thread shares the same CPU core, it could significantly increase jitter.

Priority

To modify thread priorities, use the SendingThreadPriority and ReceivingThreadPriority properties.

Spinning (busy‐wait)

Note

Spinning provided by the user-space TCP stack (e.g., Onload's EF_POLL_USEC environment variable or the latency-best profile) is usually more efficient than the session's built-in spinning.

Warmup

If the session sends SBE messages infrequently, the sending path and its associated data structures may no longer be in the CPU cache, which can increase the message sending latency.

To avoid cache misses and keep the sending path fast, call the WarmUp(IMessage, SocketFlags) method periodically (the recommended interval is 500 microseconds or less).
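A periodic warm-up loop could be sketched as follows. This is only an illustration under stated assumptions, not SDK code: the `warmUp` delegate is a hypothetical stand-in for a call to the session's WarmUp method, and the `WarmUpLoop` class and its `Run` method are invented for this example. Because .NET timers work at millisecond resolution, the sketch measures the sub-millisecond interval with `Stopwatch` and a busy-wait, which keeps one core busy.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper, not part of the SDK.
public static class WarmUpLoop
{
    // Invokes warmUp roughly once per interval until cancellation is requested.
    // Returns the number of warm-up calls made.
    public static long Run(Action warmUp, TimeSpan interval, CancellationToken token)
    {
        long intervalTicks = (long)(interval.TotalSeconds * Stopwatch.Frequency);
        long next = Stopwatch.GetTimestamp();
        long calls = 0;

        while (!token.IsCancellationRequested)
        {
            if (Stopwatch.GetTimestamp() >= next)
            {
                warmUp(); // stand-in for the session's WarmUp call (assumption)
                calls++;
                next = Stopwatch.GetTimestamp() + intervalTicks;
            }
            else
            {
                Thread.SpinWait(20); // brief busy-wait between timestamp checks
            }
        }

        return calls;
    }
}
```

An interval of 500 microseconds can be expressed as `TimeSpan.FromTicks(5000)`, since one `TimeSpan` tick is 100 nanoseconds. The loop would normally run on its own thread, ideally on a dedicated core.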

Receive Spinning Timeout

When a session's receiving thread attempts to read from the network and no incoming messages are available, the thread enters the OS kernel and blocks (the so-called "blocking wait" mode). When an incoming message becomes available, the network adapter interrupts the CPU, allowing the OS kernel to reschedule the thread so that it can continue.

Blocking, interrupts, and thread context switches are relatively expensive operations and can negatively affect the latency.

The session can be configured to spin on the processor in user mode for up to a specified number of microseconds waiting for messages from the network using the ReceiveSpinningTimeout property. If the spin period expires, the session will revert to normal blocking behavior.

Using ReceiveSpinningTimeout makes sense when the session receives SBE messages frequently; in that case, waiting in a loop is cheaper than a thread context switch into the "blocking wait" mode.
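The spin-then-block idea described above can be illustrated with a small, self-contained sketch. It is not how the session implements ReceiveSpinningTimeout; the `SpinThenBlock` class and the `tryReceive` and `blockingReceive` delegates are hypothetical names chosen for this example, standing in for a non-blocking poll and a blocking read respectively.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper, not part of the SDK.
public static class SpinThenBlock
{
    // Polls tryReceive in user mode for up to spinMicroseconds. If nothing
    // arrives within that period, falls back to the blocking receive.
    // Returns true if data was obtained while spinning, false otherwise.
    public static bool Receive(Func<bool> tryReceive, int spinMicroseconds, Action blockingReceive)
    {
        long deadline = Stopwatch.GetTimestamp()
            + spinMicroseconds * Stopwatch.Frequency / 1_000_000;

        while (Stopwatch.GetTimestamp() < deadline)
        {
            if (tryReceive())
                return true;     // data arrived without entering the kernel

            Thread.SpinWait(10); // short pause between polls
        }

        blockingReceive();       // spin period expired: block as usual
        return false;
    }
}
```

The trade-off is visible in the loop: the longer the spin period, the more likely a message is picked up without a context switch, and the more CPU time is burned when no message arrives.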

Note

Spin waiting increases CPU usage, so the spin period should not be too long.

Send Spinning Timeout

The SendSpinningTimeout property can be used to reduce message sending latency.

If the value is zero (the default) and the outgoing message cannot be sent immediately, it is saved to the outgoing queue. If the value is greater than zero, the Send(IMessage) method first waits in a spin loop for space to become available in the socket send buffer, and only then places the message in the outgoing queue (to be sent later by the sending thread).

Using SendSpinningTimeout makes sense when the session sends SBE messages frequently; in that case, waiting in a loop is cheaper than a thread context switch.
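The send-side behavior described above (try to send, spin for a bounded time, then queue) can be sketched in a few lines. This is an illustration only, not the session's implementation: `SpinThenEnqueue`, the `trySend` delegate (a stand-in for a non-blocking socket write that fails when the send buffer is full), and the `outgoing` queue are all hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper, not part of the SDK.
public static class SpinThenEnqueue
{
    // Tries to send the message directly. If that fails, retries in a spin
    // loop for up to spinMicroseconds, then queues the message for a
    // background sending thread. Returns true if the message was sent directly.
    public static bool Send<T>(T message, Func<T, bool> trySend,
                               int spinMicroseconds, ConcurrentQueue<T> outgoing)
    {
        if (trySend(message))
            return true;

        long deadline = Stopwatch.GetTimestamp()
            + spinMicroseconds * Stopwatch.Frequency / 1_000_000;

        while (Stopwatch.GetTimestamp() < deadline)
        {
            if (trySend(message))
                return true;

            Thread.SpinWait(10); // short pause between attempts
        }

        outgoing.Enqueue(message); // hand over to the sending thread
        return false;
    }
}
```

With a spin period of zero the sketch makes a single attempt and then queues the message, which mirrors the default behavior described above.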

Note

Spin waiting increases CPU usage, so the spin period should not be too long.

Logging After Sending

By default, the logging of an outgoing message to the session storage is performed before sending it to the wire. This approach is more reliable because the outgoing message is stored before going to the counterparty. However, this approach adds the logging latency to the message sending latency, so it increases the tick-to-trade latency.

When latency matters more than this guarantee, logging before sending can be switched off by setting the LogBeforeSending option to false. In this case, outgoing messages are logged to the session storage after they are sent to the wire.

Reusing Message Instances

Object creation is an expensive operation that affects both performance and memory consumption. The cost depends on how much initialization has to be performed when the object is created. The Session can reuse message instances and event arguments in its event handlers. We highly recommend turning on ReuseInboundMessage and ReuseEventArguments to minimize excess object creation and garbage collection overhead.

Note

If ReuseInboundMessage is turned on, client code must copy a message in order to use it outside of inbound callbacks. If ReuseEventArguments is turned on, client code must copy event arguments in order to use them outside of callbacks.
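The reason for the copy can be shown with a self-contained sketch that does not use the SDK. `DemoMessage`, its `Copy` method, and `ReuseDemo` are hypothetical stand-ins: `Deliver` simulates a receiver that hands the same instance to every callback, which is the situation the note above warns about.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an inbound message type; not an SDK class.
public sealed class DemoMessage
{
    public long SeqNum;

    public DemoMessage Copy() => new DemoMessage { SeqNum = SeqNum };
}

public static class ReuseDemo
{
    // Simulates a receiver that reuses one message instance for every callback.
    public static void Deliver(int count, Action<DemoMessage> onMessage)
    {
        var reused = new DemoMessage();
        for (int i = 1; i <= count; i++)
        {
            reused.SeqNum = i; // overwrite the same instance in place
            onMessage(reused);
        }
    }

    // Keeps both the raw reference and a copy of each delivered message,
    // then reports what the first kept entry of each list contains.
    public static (long Kept, long Copied) FirstSeqNums()
    {
        var kept = new List<DemoMessage>();
        var copied = new List<DemoMessage>();

        Deliver(3, m => { kept.Add(m); copied.Add(m.Copy()); });

        // kept[0] is the reused instance, so it shows the last value written (3);
        // copied[0] still holds the value seen in the first callback (1).
        return (kept[0].SeqNum, copied[0].SeqNum);
    }
}
```

Every reference stored without copying points at the same object, so all of them reflect the most recent message once the callback returns.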
