OnixS C++ CME Market Data Handler  5.4.0
API documentation
Replaying Logs, PCAPs, and CME DataMine/Historical Data

Replaying Log Files

Under normal flow, the Handler logs all essential aspects of its execution. Logged events also include the original market data (packets) processed by the Handler. Development and support teams may use this information when investigating market data processing issues. In addition, log files can be used to reproduce the Handler's original behavior during live processing of the recorded data.

The Handler can extract market data from a log file only if an instance of the OnixS::CME::MDH::FileLogger class produced that file. Replaying market data from user-defined log files is not supported. Instead, users are free to implement the Feed Engine abstraction to extract data from log files in their own formats. See The Feed Engine Concept for more information.

Warning
To get market data recorded into a log file, the OnixS::CME::MDH::FileLoggerSettings::severityLevel parameter must be set to either OnixS::CME::MDH::LogSeverity::Regular or OnixS::CME::MDH::LogSeverity::Debug. Otherwise, the Handler won't trigger market-related events during replay because the log contains no source data for them. Settings that control which debug information the Handler logs while processing market data have no influence on replay.

The OnixS::CME::MDH::replayLogFiles function does the market data replay. It accepts the list of log files to be replayed and a set of Handlers that will process the data stored in the given logs.

If log files were recorded using the SDK naming conventions, and the OnixS::CME::MDH::makeLogFilename function in particular, the file list can be populated using utility routines provided by the SDK.

FileList logs;
// Collects all log files recorded for channel 310 and stored in Data folder.
gatherLogFiles(logs, 310, "Data");
replayLogFiles(logs, handler);

Alternatively, users may populate the list of log files manually. A manually populated list allows replaying log files whose names differ from those produced by the OnixS::CME::MDH::makeLogFilename utility function.

FileList logs;
logs.push_back("1.txt");
logs.push_back("2.txt");
logs.push_back("3.txt");

Note
The SDK limits neither the amount of data to be replayed nor the size of the files, apart from restrictions imposed by the operating system. The only requirement is that the files must be listed in the exact order in which the logger recorded them within a single processing session.
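
When a list is assembled by hand, a plain lexical sort can break the required order (for example, "10.txt" sorts before "2.txt"). The following is a minimal sketch of a natural-order sort, assuming the numeric parts of the file names reflect the recording order. The FileNames type is a stand-in for the SDK's FileList, and naturalLess and sortNaturally are hypothetical helpers, not part of the SDK.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the SDK's FileList (assumption: a sequence of file names).
typedef std::vector<std::string> FileNames;

// Hypothetical helper: true if the character is a decimal digit.
inline bool isDigitChar(char c)
{
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical helper: orders names so that embedded numbers are
// compared by value, e.g. "2.txt" comes before "10.txt".
bool naturalLess(const std::string& a, const std::string& b)
{
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size())
    {
        if (isDigitChar(a[i]) && isDigitChar(b[j]))
        {
            // Skip leading zeros, then locate the end of each digit run.
            while (i < a.size() && a[i] == '0') ++i;
            while (j < b.size() && b[j] == '0') ++j;
            std::size_t ie = i, je = j;
            while (ie < a.size() && isDigitChar(a[ie])) ++ie;
            while (je < b.size() && isDigitChar(b[je])) ++je;
            // A longer run of significant digits is a larger number.
            if (ie - i != je - j) return (ie - i) < (je - j);
            const int c = a.compare(i, ie - i, b, j, je - j);
            if (c != 0) return c < 0;
            i = ie;
            j = je;
        }
        else
        {
            if (a[i] != b[j]) return a[i] < b[j];
            ++i;
            ++j;
        }
    }
    // The name with characters left over sorts last.
    return (a.size() - i) < (b.size() - j);
}

// Sorts a manually assembled list into natural order.
void sortNaturally(FileNames& names)
{
    std::sort(names.begin(), names.end(), naturalLess);
    assert(std::is_sorted(names.begin(), names.end(), naturalLess));
}
```

If the file names do not encode the recording order, the list has to be ordered by other means (for instance, by the order in which the files were written).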

The critical requirement of the log replay feature is that the files must form a continuous recording of a single processing session. For example, suppose the Handler is launched each morning and stopped each evening during the trading week. Files produced by a logger for a single day refer to a single processing session. Thus, these files can be replayed together as a single list. At the same time, files recorded on different days refer to different processing sessions and can't be replayed at once.

Note
As noted, under ordinary conditions the replay machinery can't replay files from different processing sessions at once. However, if the files were recorded within a single trading week, which usually starts on Sunday and ends on Friday, it's possible to replay them at once, although with some limitations. The Additional Settings Affecting Log Replay section explains how to customize the replay.

Replaying Logs Containing Data From Multiple Channels

The logging services allow using the same instance of the Logger for multiple instances of the Handler. Thus, a single log may contain market data belonging to multiple channels. Enhancements made in recent releases make it possible to replay such log files for multiple instances of the Handler.

The following code snippet shows how to replay a log file containing data recorded by two Handlers for channels 310 and 312:

void configure(HandlerSettings& settings, ChannelId channel)
{
    settings.channel(channel);
    ...
}
...
FileList logs;
logs.push_back("LogForAllHandlers.txt");
Handler handler310, handler312;
configure(handler310.settings(), 310);
configure(handler312.settings(), 312);
Handler* handlers[] = { &handler310, &handler312 };
const size_t handlerQty = staticArrayLength(handlers);
replayLogFiles(logs, handlers, handlerQty);

Note
The channel identifier must be unique within the collection of Handlers passed to the replay.
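
The uniqueness requirement can be verified before starting a replay. Below is a minimal sketch of such a pre-flight check. It models channel identifiers as plain unsigned integers, and channelsAreUnique is a hypothetical helper, not an SDK function.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical pre-flight check: returns true if every channel id in
// the array is distinct. Channel ids are modelled as unsigned integers.
bool channelsAreUnique(const unsigned* channels, std::size_t qty)
{
    assert(channels != 0 || qty == 0);

    std::set<unsigned> seen;
    for (std::size_t i = 0; i < qty; ++i)
    {
        // insert() reports whether the value was newly added.
        if (!seen.insert(channels[i]).second)
            return false;
    }
    return true;
}
```

In an application, the identifiers would be collected from the settings of the Handlers about to be passed to the replay.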

Additional Settings Affecting Log Replay

The Handler performs market data processing according to its settings. The kind of market data written to log files, and its order, depend on the Handler's session settings at the time the log files were produced. Suppose the Handler was configured to perform start-up recovery because of a mid-week late join. As a result, the instrument and market (snapshot) recovery data is put into a log file before the Handler switches to real-time incremental data processing. Also, if the Handler is configured to recover market state from snapshots each time any packet is lost, the snapshot recovery loops will be recorded for each detected gap.

Therefore, by default, the log replay extracts the parameters of the processing session from the given log files and applies them to the Handler instances participating in the replay for the duration of the replay. However, the SDK allows overriding the default behavior so that the log replay processes logged market data according to user-defined settings.

The primary settings affecting the behavior of the log replay are gathered in the OnixS::CME::MDH::LogReplaySupplements class. The following table describes the available settings and how they affect the log replay functionality.

Parameter Description
OnixS::CME::MDH::LogReplaySupplements::settingsUse

The given parameter defines the way the log replay handles the processing session parameters for the Handlers participating in the log replay.

In particular, if the parameter is set to OnixS::CME::MDH::HandlerSettingsUse::Suggested value, the log replay extracts parameters for the processing session from the log files and temporarily updates each instance of the Handler participating in the replay. This is the default value.

Alternatively, if the parameter is set to the OnixS::CME::MDH::HandlerSettingsUse::AsIs value, the log replay does not modify the processing session parameters of any Handler instance. In this mode, the processing flow may differ from the one observed during the original session (whose data is replayed).

OnixS::CME::MDH::LogReplaySupplements::aliases

The replay machinery is based on feed id matching. The replay engine pushes extracted market data to a feed used by a Handler participating in the replay if the data source identifier matches the feed identifier.

However, sometimes more flexible matching is needed. For example, a log file may contain data for a different channel.

For such cases, this parameter allows customizing data source matching. It represents a set of aliases for data sources and tells the replay engine to use the predefined matching instead of direct correspondence.

OnixS::CME::MDH::LogReplaySupplements::timeSpan

By default, the log replay processes all records from the log files to be replayed. This parameter defines the time interval for which the logged data must be replayed. Entries whose logging time falls outside the given time span are skipped by the replay machinery.

OnixS::CME::MDH::LogReplaySupplements::speed

This parameter controls the speed at which market data is replayed. By default, the log replay extracts and pushes market data to the Handler without any delays. Therefore, recorded data is replayed faster than it was processed during the original session. However, it is possible to override this behavior and tell the replay machinery to replay market data at the same speed as it was processed during the original session.
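
The effect of the time span and speed settings can be illustrated with a small self-contained model. This is a sketch of the behavior described above and not the SDK's implementation. LoggedEntry, filterBySpan, and originalPaceDelays are hypothetical names, and treating both bounds of the span as inclusive is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a logged entry; only its logging time matters here.
struct LoggedEntry
{
    std::int64_t timeNs; // logging time in nanoseconds since an arbitrary epoch
};

// Keeps the entries whose logging time lies within [beginNs, endNs].
// Inclusive bounds are an assumption of this sketch.
std::vector<LoggedEntry> filterBySpan(
    const std::vector<LoggedEntry>& entries,
    std::int64_t beginNs,
    std::int64_t endNs)
{
    assert(beginNs <= endNs);

    std::vector<LoggedEntry> kept;
    for (std::size_t i = 0; i < entries.size(); ++i)
    {
        if (entries[i].timeNs >= beginNs && entries[i].timeNs <= endNs)
            kept.push_back(entries[i]);
    }
    return kept;
}

// Computes how long to wait before pushing each entry when replaying
// at the original pace: the gap to the previous entry, 0 for the first.
// Replaying without delays corresponds to ignoring these values.
std::vector<std::int64_t> originalPaceDelays(
    const std::vector<LoggedEntry>& entries)
{
    std::vector<std::int64_t> delays;
    for (std::size_t i = 0; i < entries.size(); ++i)
    {
        delays.push_back(
            i == 0 ? 0 : entries[i].timeNs - entries[i - 1].timeNs);
    }
    return delays;
}
```

In this model, an entry outside the span is dropped before pacing is computed, so the first replayed entry is always delivered immediately.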

The following sample shows how to configure the replay of a log file recorded for a different channel and with different session settings.

// Additional parameters controlling the replay.
LogReplaySupplements supplements;
// Tells the replay not to read processing settings from the logs.
supplements.settingsUse(HandlerSettingsUse::AsIs);
// Maps both incremental feeds of channel 310
// to the single incremental feed A of channel 312.
supplements.aliases()["310IA"] = "312IA";
supplements.aliases()["310IB"] = "312IA";
// The Handler.
Handler handler;
handler.settings().channel(312);
// Configures the Handler for natural refresh processing.
setSessionToNaturalRefresh(handler.settings().session());
// Files to be replayed.
FileList logs;
gatherLogFiles(logs, 310, "Data");
// Now, replays the gathered logs.
replayLogFiles(logs, handler, supplements);
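
The alias lookup used in this sample can be pictured as a simple map from a recorded data source identifier to the feed identifier that should receive its data. The following self-contained sketch models that idea only. The Aliases type and the resolveFeed function are hypothetical and do not represent the SDK's internals.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model: recorded data source id -> target feed id.
typedef std::map<std::string, std::string> Aliases;

// Returns the feed id that data from the given source is delivered to.
// Sources without an alias fall back to direct correspondence.
std::string resolveFeed(const Aliases& aliases, const std::string& source)
{
    assert(!source.empty());

    const Aliases::const_iterator found = aliases.find(source);
    return found == aliases.end() ? source : found->second;
}
```

With the two aliases from the sample above, both "310IA" and "310IB" resolve to "312IA", while an identifier without an alias resolves to itself.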

Event Notifications While Replaying Log Files

From the event notification perspective, there's no difference whether the Handler processes market data received from the network or from a log file. This is because the log replay functionality is built on the same Feed Engine concept: internally, it uses its own Feed Engine, which extracts data from the given log files rather than from network sources.

Note
During the replay, the OnixS::CME::MDH::PacketArgs::receiveTime member returns the original time at which the market data packet was received from the network.

Replaying Packet Captures

As with log replay, the 5th major release allows replaying market data stored as network packet captures (.PCAP files). The API and the replay machinery closely resemble those of the log replay to give users a consistent experience.

The following table summarizes API available for PCAP replay:

Parameter Description
OnixS::CME::MDH::gatherFiles

Gathers files which are stored in the given folder with the given extension. Gathered files are sorted by name.

Note
In contrast to the OnixS::CME::MDH::gatherLogFiles function, which collects log files in the exact order in which the logger produced them, this function is a generic routine that finds files matching the given pattern (extension).

OnixS::CME::MDH::replayPcapFiles

Replays the given list of .PCAP files for the given set of Handlers.

Warning
In contrast to log files, PCAP files do not include the Handler settings used to process the recorded market data. As a result, Handlers must be configured explicitly. Also, the current implementation is limited to replaying incremental data only. Therefore, users must explicitly configure the Handler instances participating in the replay to process market data in natural refresh mode without any live recovery capabilities.

Recovering Instrument Definitions While Replaying PCAP Files

The PCAP replay subsystem processes data from incremental feeds only. Usually, MDP transmits instrument definitions through the incremental feeds only at the beginning of the week. Thus, if capturing started later in the week, the recorded packets may contain no instrument definitions. In the absence of a security definition, the Handler uses default values for the depths of direct and implied books. Therefore, suitable values must be set for the corresponding parameters. Otherwise, order book maintenance issues may occur during the replay.

Note
Default depths for the direct and implied books can be accessed by the following paths: handler.settings().bookManagement().directBooks().defaultDepth() and handler.settings().bookManagement().impliedBooks().defaultDepth().

The replay subsystem was improved to avoid issues in the MBP book maintenance due to the lack of instrument definitions in the replayed data. It allows recovering instrument definitions from a previously recorded cache file or a 'secdef.dat' downloaded from the CME FTP. In such a case, a processing session is to be configured to recover instruments at the start-up (join) stage, and a path to a file containing instrument definitions must be defined:

Handler handler;
HandlerSettings& settings = handler.settings();
settings.session().joinRecovery(JoinRecoveryOptions::Instruments);
settings.instrumentCache("secdef.dat");

Having instrument definitions while replaying market data from PCAP files is also essential for security filtering. Security filtering allows selecting instruments based on attributes such as id, security group, symbol, and asset. However, only the security id is a primary attribute, because MDP uses it to link data such as order book updates with instruments. Other attributes, like symbols or assets, are retrieved from an instrument definition. Therefore, the lack of instrument definitions narrows the filtering capabilities. Only filtering by security id will function correctly. Filtering by any other attribute, like symbol, group, or asset, will not work for securities whose definitions are not available during data replay.
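
The dependency of attribute-based filtering on instrument definitions can be shown with a small model. This is an illustrative sketch, not the SDK's filtering code. Definition, Definitions, passesIdFilter, and passesSymbolFilter are hypothetical names, and rejecting updates for securities without a definition is an assumption made for the illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of the attributes carried by an instrument definition.
struct Definition
{
    std::string symbol;
    std::string group;
};

// Known definitions keyed by security id.
typedef std::map<int, Definition> Definitions;

// Filtering by id needs nothing but the id carried by the update itself.
bool passesIdFilter(int securityId, int wantedId)
{
    return securityId == wantedId;
}

// Filtering by symbol needs the definition to translate id -> symbol.
// Assumption: an update for an undefined security cannot be matched.
bool passesSymbolFilter(
    const Definitions& definitions,
    int securityId,
    const std::string& wantedSymbol)
{
    assert(!wantedSymbol.empty());

    const Definitions::const_iterator found = definitions.find(securityId);
    return found != definitions.end() && found->second.symbol == wantedSymbol;
}
```

In this model, the same update passes the symbol filter once its definition has been loaded (for example, from an instrument cache) and fails it otherwise, while the id filter is unaffected.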

Packet Captures Matching & Aliasing

The PCAP replay engine extracts multicast group information from a captured packet and dispatches the packet to the feed with the same multicast group. This is how the replay subsystem matches packets to feeds. However, sometimes more flexible matching is needed. The PCAP replay control parameters allow changing the way data matching is performed. The OnixS::CME::MDH::PcapReplaySupplements::aliases member exposes an instance of the OnixS::CME::MDH::NetAddressAliases class, which allows redirecting data from one source to another at replay time.

Suppose captured packets belong to the production environment while the Handler is configured with the certification environment's connectivity configuration. The production and certification environments use different multicast groups to serve the same channels to avoid simultaneous data transmission conflicts. Thus, when data captured in the production environment is replayed using the certification connectivity configuration, nothing happens from the user's perspective. The Handler triggers no events because the replay subsystem finds no data for the certification feeds. Users must define source aliases to make the replay subsystem treat production feeds as certification ones. The following code snippet shows how to handle the described case:

PcapReplaySupplements supplements;
supplements.aliases()
[
// Multicast group for primary incremental feed
// of the channel 310 in the production environment.
NetFeedConnection("224.0.31.1", 14310)
] =
// Multicast group for the same feed (primary incremental,
// channel 310) but belonging to the certification environment.
NetFeedConnection("224.0.28.1", 14310);

Replaying CME DataMine Files

CME DataMine is the official source of market data for CME Group markets. CME exposes a wide range of data types, including Market Depth, End-of-Day, Block Trades, etc. The SDK allows customers to run their solutions on realistic data by providing the ability to replay historical data from CME DataMine.

The table below exposes functions available for the users:

Parameter Description
OnixS::CME::MDH::gatherFiles

Gathers files which are stored in the given folder with the given extension. Gathered files are sorted by name.

Note
In contrast to the OnixS::CME::MDH::gatherLogFiles function, which collects log files in the exact order in which the logger produced them, this function is a generic routine that finds files matching the given pattern (extension).

OnixS::CME::MDH::replayDatamineFiles

Replays the given list of CME DataMine files for the given set of Handlers.

Note
From the user's perspective, the DataMine data replay is close to the PCAP replay in terms of limitations and behavior customization. Therefore, it's highly recommended to get familiar with the major aspects of the PCAP replay functionality.

Warning
For the same reasons as in the case of PCAP data replay, replaying historical data from CME DataMine is limited to processing incremental data only. Therefore, the Handler instances participating in the replay must be configured to process market data in natural refresh mode without any recovery capabilities (except instrument definition recovery from an instrument cache).

The critical aspect of the historical data replay feature is related to the kind of data supported. CME DataMine offers historical data in various formats, including FIX and market data packet captures. The replay functionality accepts historical data as market data packet captures only. The Handler's processing engine is built over SBE binary structures to gain maximal performance. Also, replay functionality simulates data reception and raises events related to packet handling. FIX messages do not contain information stored in packets. Therefore, they can't be used as a source of data for replay.

Warning
CME refers to its format for storing historical data as packet captures. However, it's not the same as network packet captures, which contain raw network attributes such as IP headers. The CME Packet Capture dataset is a different binary format. Thus, CME historical data in this format can't be replayed by the OnixS::CME::MDH::replayPcapFiles function. For this reason, the SDK exposes an additional function designed to extract data from files in this particular format.

See CME Packet Capture Dataset for more information.
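
Because the two kinds of files are easy to confuse, it may help to check what a file looks like before choosing a replay function. The sketch below inspects the first four bytes of a file for the well-known magic numbers of the libpcap and pcapng file formats. CaptureKind and classifyCapture are hypothetical names, not part of the SDK. The sketch says nothing about which network capture variants the OnixS::CME::MDH::replayPcapFiles function accepts, and a file that is not recognized is not thereby proven to be a CME Packet Capture file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical classification of a file by its leading bytes.
enum CaptureKind
{
    ClassicPcap,       // libpcap file format
    PcapNg,            // pcapng file format
    NotNetworkCapture  // none of the known network capture signatures
};

// Classifies a file given its first bytes (at least four are needed).
CaptureKind classifyCapture(const unsigned char* bytes, std::size_t size)
{
    assert(bytes != 0 || size == 0);

    if (size < 4)
        return NotNetworkCapture;

    // Assemble the first four bytes as a big-endian number.
    const std::uint32_t magic =
        (static_cast<std::uint32_t>(bytes[0]) << 24) |
        (static_cast<std::uint32_t>(bytes[1]) << 16) |
        (static_cast<std::uint32_t>(bytes[2]) << 8) |
        static_cast<std::uint32_t>(bytes[3]);

    switch (magic)
    {
    case 0xA1B2C3D4u: // microsecond timestamps, big-endian writer
    case 0xD4C3B2A1u: // microsecond timestamps, little-endian writer
    case 0xA1B23C4Du: // nanosecond timestamps, big-endian writer
    case 0x4D3CB2A1u: // nanosecond timestamps, little-endian writer
        return ClassicPcap;
    case 0x0A0D0D0Au: // pcapng Section Header Block type
        return PcapNg;
    default:
        return NotNetworkCapture;
    }
}
```

An application would read the first four bytes of each candidate file and pass them to the function before building the list of files for replay.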

Replay Sample

A sample application demonstrating the Log Replay functionality can be found in the samples/Replay sub-folder of the SDK distribution package.