OnixS C++ CME Streamlined Market Data Handler  0.1.2
API documentation
Low Latency Best Practices

This topic describes how to configure the Handler to achieve maximum performance and the lowest processing latency.

Configuring Logging Subsystem

Under normal conditions, the Handler logs important events and the market data transmitted by MDP to a log file. Because log entries are textual, binary data such as incoming market data packets is base64-encoded before being stored in the log, which adds extra time to the processing cycle. In addition, if the Logger implementation stores its data in a file, writing it may be a relatively slow operation.

To eliminate slowdowns caused by flushing data to the file system and by the extra encoding, logging can be disabled by binding an instance of the OnixS::CME::Streamlined::MDH::NullLogger class to the Handler. In that case, the Handler does not construct log events, and nothing is logged at all.
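The saving comes from skipping the construction of log entries, not only from skipping the write. The following is a minimal sketch of that idea in plain C++, not the Handler's actual logging API: `Sink`, `DiscardingSink`, `MemorySink`, `toHex` and `logPacket` are hypothetical names introduced for illustration, and hex encoding stands in for the base64 step.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical logging interface; not part of the Handler's API.
struct Sink
{
    virtual ~Sink() {}
    virtual bool enabled() const = 0;
    virtual void write(const std::string& entry) = 0;
};

// Null-object sink: reports itself as disabled and discards everything.
struct DiscardingSink : Sink
{
    bool enabled() const { return false; }
    void write(const std::string&) {}
};

// Sink that keeps entries in memory, standing in for a file-based logger.
struct MemorySink : Sink
{
    std::vector<std::string> entries;
    bool enabled() const { return true; }
    void write(const std::string& entry) { entries.push_back(entry); }
};

// Text encoding of binary data (hex here, as a stand-in for base64).
std::string toHex(const unsigned char* data, std::size_t size)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(size * 2);
    for (std::size_t i = 0; i < size; ++i)
    {
        out.push_back(digits[data[i] >> 4]);
        out.push_back(digits[data[i] & 0x0F]);
    }
    return out;
}

// Returns true if a log entry was constructed and written.
bool logPacket(Sink& sink, const unsigned char* data, std::size_t size)
{
    if (!sink.enabled())
        return false; // Encoding is skipped entirely, not just the write.
    sink.write("packet " + toHex(data, size));
    return true;
}
```

With a `DiscardingSink`, `logPacket` returns before any encoding or string building takes place, which is the behavior the paragraph above attributes to a disabled logger.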

Tuning Up Working Threads

Market data is processed asynchronously by working threads. Under normal conditions, a working thread may be executed on any processor available in the system. That may hurt overall performance because of unnecessary thread context switching.

To avoid switching threads between processors, the Feed Engine, as the manager of working threads, allows establishing processor affinity for each working thread:

FeedEngineSettings feSettings1;
FeedEngineSettings feSettings2;
// Scheduling Feed Engines to use different processors.
feSettings1.threadAffinity().insert(1);
feSettings2.threadAffinity().insert(2);
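For readers who want to see what processor affinity means at the operating-system level, the sketch below pins the calling thread to a single CPU on Linux using `sched_setaffinity`. It is an illustration under stated assumptions only: the Handler's own implementation is not described here, the helper names (`firstAllowedCpu`, `pinCurrentThreadTo`, `allowedCpuCount`) are hypothetical, and the code is Linux-specific.

```cpp
#include <sched.h> // Linux-specific; g++ enables the GNU extensions by default.

// Returns the lowest-numbered CPU the calling thread may run on, or -1 on error.
int firstAllowedCpu()
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &allowed))
            return cpu;
    return -1;
}

// Restricts the calling thread to a single CPU; returns true on success.
bool pinCurrentThreadTo(int cpu)
{
    cpu_set_t target;
    CPU_ZERO(&target);
    CPU_SET(cpu, &target);
    // A pid of 0 refers to the calling thread.
    return sched_setaffinity(0, sizeof(target), &target) == 0;
}

// Returns the number of CPUs in the calling thread's affinity mask, or -1 on error.
int allowedCpuCount()
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    return CPU_COUNT(&allowed);
}
```

Once a thread is pinned, the scheduler no longer migrates it between processors, so the thread keeps returning to the same processor's cache.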

In addition to controlling thread affinity, the Feed Engine exposes a set of events that its working threads trigger at the beginning of processing and before the processing loop ends. See Feed Engine Events for more information.

With the help of working thread events, it is possible to perform more advanced thread tuning, such as updating thread priority:

// SetThreadPriority and GetCurrentThread are Windows API functions (<windows.h>).
struct ThreadPriorityManager : FeedEngineListener
{
    // Invoked by a working thread at the beginning of processing.
    void onFeedEngineThreadBegin(const FeedEngine&)
    {
        SetThreadPriority(
            GetCurrentThread(),
            THREAD_PRIORITY_TIME_CRITICAL);
    }
};

FeedEngineSettings feSettings;
ThreadPriorityManager priorityManager;
FeedEngine feedEngine(feSettings, &priorityManager);

Decreasing the Time the Handler Spends Waiting for Incoming Data

When the Handler finishes processing previously received market data, it initiates reception of new incoming data if any is available in the feed. In practice, data packets do not arrive back to back; there is a certain time interval between two neighboring packets. During these pauses, the Feed Engine bound to the Handler suspends its activity and sleeps in the kernel while waiting for incoming data. As a result, data and executable code can be evicted from the processor's cache, so the next iteration of the processing loop runs slower than the previous one.

The Feed Engine provides the OnixS::CME::Streamlined::MDH::FeedEngineSettings::dataWaitTime parameter, whose value defines the time the Feed Engine spends in an I/O operation while waiting for incoming data. Reducing this value increases the number of wake-ups and thus reduces the probability that the Feed Engine's data and code are evicted from the processor's cache. If the parameter is set to zero, the Feed Engine only checks whether data is available and does not enter the kernel to sleep. The drawback of reducing the wait time is increased processor consumption (up to 100% when the parameter is zero).
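The difference between a blocking wait and a zero wait time can be sketched with the POSIX `poll` call. This is only an illustration of the trade-off: which I/O primitive the Feed Engine actually uses is not stated here, and `waitForData` and `spinUntilReadable` are hypothetical helpers. Note that `poll` with a zero timeout still issues a system call, but it returns immediately instead of putting the thread to sleep.

```cpp
#include <poll.h>
#include <unistd.h>

// Waits up to waitTimeMs milliseconds for fd to become readable.
// A value of 0 only checks for data and never blocks.
// Returns true when data is available for reading.
bool waitForData(int fd, int waitTimeMs)
{
    pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    return poll(&pfd, 1, waitTimeMs) > 0 && (pfd.revents & POLLIN) != 0;
}

// Busy-polls with a zero wait time until data shows up or maxSpins checks
// have been made. Returns the number of checks performed, or -1 if no data
// arrived. The thread never sleeps, so it keeps a processor fully busy.
int spinUntilReadable(int fd, int maxSpins)
{
    for (int spins = 1; spins <= maxSpins; ++spins)
        if (waitForData(fd, 0))
            return spins;
    return -1;
}
```

A loop like `spinUntilReadable` stays hot in the processor's cache at the cost of consuming a full core, which mirrors the trade-off described for a zero wait time.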

Suppressing Market Data Copying

Under normal conditions, the Handler efficiently reuses the internal structures that hold incoming market data. Packets and messages are reused once the Handler has processed the data they contain. Therefore, no memory is allocated during real-time market data processing.

However, data may be copied within the callbacks that the Handler invokes to notify listeners about various market data events. For example, copying a book involves memory allocation and therefore has a negative effect on performance and latency. Copying should be minimized, or preallocation strategies should be used, to improve the results.

For example, book snapshots can be constructed with a capacity sufficient to store a particular number of price levels. Constructing book snapshots with an initial capacity large enough to hold books of the maximum possible depth eliminates reallocations.
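The preallocation strategy can be illustrated with a standard container. The sketch below is not the Handler's book snapshot class: `PriceLevel` and `SnapshotBuffer` are hypothetical types that show how storage reserved once up front is reused by every later copy, as long as the copied depth does not exceed the reserved capacity.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical price level; the Handler's real types are not shown here.
struct PriceLevel
{
    long long price;
    int quantity;
};

// Hypothetical snapshot buffer that allocates its storage once, up front.
class SnapshotBuffer
{
public:
    // Reserves room for maxDepth levels; this is the only allocation.
    explicit SnapshotBuffer(std::size_t maxDepth)
    {
        levels_.reserve(maxDepth);
    }

    // Copies count levels into the buffer. No reallocation takes place
    // as long as count does not exceed the reserved capacity.
    void assign(const PriceLevel* source, std::size_t count)
    {
        levels_.assign(source, source + count);
    }

    std::size_t size() const { return levels_.size(); }
    std::size_t capacity() const { return levels_.capacity(); }
    const PriceLevel* data() const { return levels_.data(); }

private:
    std::vector<PriceLevel> levels_;
};
```

A buffer of this kind would typically be created before market data processing starts and then refilled from within a callback, so the copy itself touches no allocator.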