OnixS C++ CME Market Data Handler  4.4.1
API documentation
Low Latency Best Practices [UPDATED]

Inner Contents

 Benchmarking Results [UPDATED]
 

Detailed Description

This topic describes how to configure the Handler to achieve maximum performance and the lowest processing latency.

Configuring Logging Subsystem

Under normal conditions, the Handler logs important events and the market data transmitted by MDP to a log file. Because log entries are textual, binary data such as incoming market data packets is base64-encoded before it is stored in the log, which adds time to the processing cycle. In addition, if the Logger implementation stores its data in a file, the write itself may be a relatively slow operation.

To eliminate the slowdowns caused by flushing data to the filesystem and by the extra encoding, logging can be disabled by binding an instance of OnixS::CME::MDH::NullLogger to the Handler. In that case, the Handler does not construct log events and nothing is logged at all.
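The reason a discarding logger removes the encoding cost, and not just the file write, can be illustrated with a minimal, self-contained sketch. It does not use the OnixS API: the Logger, DiscardingLogger and MemoryLogger types, the encodeForLog and processPacket functions, and the use of hex encoding as a stand-in for base64 are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical logger interface, for illustration only (not the OnixS API).
struct Logger
{
    virtual ~Logger() {}
    // Tells the caller whether building a log entry is worth the effort.
    virtual bool enabled() const = 0;
    virtual void write(const std::string& entry) = 0;
};

// Discards everything and reports itself as disabled.
struct DiscardingLogger : Logger
{
    bool enabled() const override { return false; }
    void write(const std::string&) override {}
};

// Keeps entries in memory; stands in for a file-based logger.
struct MemoryLogger : Logger
{
    std::vector<std::string> entries;
    bool enabled() const override { return true; }
    void write(const std::string& entry) override { entries.push_back(entry); }
};

// Counts how many times the text encoding ran.
static std::size_t g_encodeCalls = 0;

// Stand-in for base64: hex-encodes the packet so it can be logged as text.
std::string encodeForLog(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    static const char digits[] = "0123456789abcdef";
    ++g_encodeCalls;
    std::string text;
    text.reserve(size * 2);
    for (std::size_t i = 0; i < size; ++i)
    {
        text.push_back(digits[data[i] >> 4]);
        text.push_back(digits[data[i] & 0x0F]);
    }
    return text;
}

// Processes a packet; the encoding cost is paid only if the logger wants it.
void processPacket(Logger& logger, const unsigned char* data, std::size_t size)
{
    if (logger.enabled())
        logger.write(encodeForLog(data, size));
    // Decoding and dispatching of the packet would follow here.
}
```

Passing a DiscardingLogger to processPacket skips encodeForLog entirely, whereas a MemoryLogger pays for the encoding on every packet. This mirrors the statement above that log events are not even constructed when logging is disabled.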

Tuning Up Working Threads

Market data is processed asynchronously by working threads. Under normal conditions, a thread may be executed on any processor available in the system. That can hurt overall performance because of unnecessary thread context switching.

To keep threads from switching between processors, the Feed Engine, as the manager of working threads, allows processor affinity to be set for each working thread:

FeedEngineSettings feSettings1;
FeedEngineSettings feSettings2;
// Scheduling Feed Engines to use different processors.
feSettings1.threadAffinity().insert(1);
feSettings2.threadAffinity().insert(2);
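For readers who want to see what processor affinity means at the operating-system level, here is a minimal sketch for Linux with glibc. It is not the Feed Engine's implementation, and the helper names firstAllowedCpu, pinCurrentThreadTo and allowedCpuCount are hypothetical. The sketch only relies on sched_getaffinity and sched_setaffinity, where a pid of 0 refers to the calling thread.

```cpp
#include <sched.h>   // sched_getaffinity, sched_setaffinity, CPU_* macros (Linux, glibc)
#include <cassert>

// Returns the lowest-numbered CPU the calling thread may run on, or -1 on failure.
int firstAllowedCpu()
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &allowed))
            return cpu;
    return -1;
}

// Pins the calling thread to a single CPU; returns true on success.
bool pinCurrentThreadTo(int cpu)
{
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t target;
    CPU_ZERO(&target);
    CPU_SET(cpu, &target);
    // A pid of 0 means "the calling thread".
    return sched_setaffinity(0, sizeof(target), &target) == 0;
}

// Reports how many CPUs the calling thread is currently allowed to use, or -1 on failure.
int allowedCpuCount()
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    return CPU_COUNT(&allowed);
}
```

After a successful pinCurrentThreadTo call, the scheduler keeps the thread on that one processor, so it no longer migrates and its working set stays in that processor's cache.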

In addition to controlling thread affinity for its working threads, the Feed Engine provides a set of events that working threads trigger at the beginning of processing and before the processing loop ends. See Feed Engine Events for more information.

With the help of working thread events, it's possible to perform more advanced thread tuning, such as updating thread priority:

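// Windows-specific: SetThreadPriority and GetCurrentThread come from <windows.h>.
#include <windows.h>

using namespace OnixS::CME::MDH;

struct ThreadPriorityManager : FeedEngineListener
{
    // Invoked by each working thread when it begins processing.
    void onFeedEngineThreadBegin(const FeedEngine&)
    {
        // Raise the priority of the calling (working) thread.
        SetThreadPriority(
            GetCurrentThread(),
            THREAD_PRIORITY_TIME_CRITICAL);
    }
};

FeedEngineSettings feSettings;
ThreadPriorityManager priorityManager;

// Bind the listener so the Feed Engine reports thread events to it.
FeedEngine feedEngine(feSettings, &priorityManager);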

Decreasing the Time the Handler Spends Waiting for Incoming Data

When the Handler finishes processing previously received market data, it initiates reception of new incoming data if any is available in the feed. In reality, data packets do not arrive back to back; there is a certain time interval between two neighboring packets. Pauses between incoming packets cause the Feed Engine bound to the Handler to suspend its activities and sleep in the kernel while waiting for incoming data. As a result, data and executable code can be evicted from the processor's cache, so the next iteration of the processing loop runs slower than the previous one.

To control how long the Feed Engine waits in the kernel for incoming data, it provides the OnixS::CME::MDH::FeedEngineSettings::dataWaitTime parameter, whose value defines the time the Feed Engine spends in an I/O operation while waiting for incoming data. Reducing the value of this parameter increases the number of wake-ups and thus reduces the probability that the Feed Engine's data and code are thrown out of the processor's cache. If the parameter is set to zero, the Feed Engine just checks whether data is available and does not enter the kernel to sleep. The drawback of reducing the waiting time is higher processor consumption (up to 100% when the parameter is zero).
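The trade-off between sleeping in the kernel and busy polling can be sketched with the POSIX poll() call. This is an analogy under stated assumptions, not the Feed Engine's actual implementation: the waitForData function and its waitTimeMs and maxAttempts parameters are hypothetical, and waitTimeMs merely plays the role that the data wait time plays in the text above.

```cpp
#include <poll.h>     // poll, pollfd, POLLIN (POSIX)
#include <unistd.h>   // pipe, write, close (used by callers of this sketch)
#include <cassert>

// Waits for data on fd. waitTimeMs plays the role of a data wait time:
//   > 0  : sleep in the kernel for up to waitTimeMs per attempt;
//   == 0 : only check for data and return immediately (busy polling).
// Returns the number of poll() calls made until data arrived,
// or -1 if maxAttempts were exhausted or poll() failed.
int waitForData(int fd, int waitTimeMs, int maxAttempts)
{
    assert(waitTimeMs >= 0 && maxAttempts > 0);
    pollfd descriptor;
    descriptor.fd = fd;
    descriptor.events = POLLIN;
    descriptor.revents = 0;
    for (int attempt = 1; attempt <= maxAttempts; ++attempt)
    {
        const int ready = poll(&descriptor, 1, waitTimeMs);
        if (ready < 0)
            return -1;          // poll() failed
        if (ready > 0 && (descriptor.revents & POLLIN))
            return attempt;     // data is available
        // Nothing to read yet; try again.
    }
    return -1;
}
```

With waitTimeMs set to zero, the loop never sleeps: each poll() call returns at once, so the thread stays active at the cost of consuming a full processor while no data arrives. With a larger value, the thread sleeps inside poll() and wakes up less often.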

Suppressing Market Data Copying

Under normal conditions, the Handler efficiently reuses the internal structures that hold incoming market data. Packets and FIX messages are reused once the Handler has processed the data they contain. Therefore, no memory is allocated during real-time market data processing.

However, data may be copied within the callbacks that the Handler invokes on listeners for various market data events. For example, copying a book triggers memory allocation, which has a negative effect on performance and latency. To improve results, minimize copying or use strategies based on preallocation.

For example, a book snapshot can be constructed with a given capacity, sufficient to store a particular number of price levels. Constructing book snapshots with an initial capacity sufficient to hold books of the maximum possible depth eliminates reallocations.
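The preallocation strategy can be illustrated with a small, self-contained sketch built on std::vector. The PriceLevel and BookSnapshot types below are hypothetical and are not the Handler's own classes. The sketch only shows the general idea of reserving storage once so that later copies fit into it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical price level, for illustration only (not an OnixS type).
struct PriceLevel
{
    double price;
    int quantity;
};

// Hypothetical snapshot that reserves room for maxDepth levels up front.
class BookSnapshot
{
public:
    explicit BookSnapshot(std::size_t maxDepth)
    {
        levels_.reserve(maxDepth);   // storage is obtained once, ahead of time
    }

    // Overwrites the snapshot with the given levels.
    // Returns true if the capacity did not change, i.e. the copy
    // fit into the preallocated storage.
    bool assign(const PriceLevel* levels, std::size_t count)
    {
        assert(levels != nullptr || count == 0);
        const std::size_t capacityBefore = levels_.capacity();
        levels_.assign(levels, levels + count);
        return levels_.capacity() == capacityBefore;
    }

    std::size_t depth() const { return levels_.size(); }
    std::size_t capacity() const { return levels_.capacity(); }

private:
    std::vector<PriceLevel> levels_;
};
```

A snapshot created once outside the callback, for instance BookSnapshot snapshot(10) for a book of depth 10, can then be refilled from within the callback on every update without its storage growing.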