Under normal flow, the Handler logs all essential aspects of its execution. Logged events also include the original market data (packets) processed by the Handler. Development and support teams may use this information when investigating market data processing issues. Moreover, log files can be used to reproduce the Handler's original behavior during the live processing of the recorded data.
The Handler can extract market data from a log file only if that file was produced by an instance of the OnixS::CME::ConflatedUDP::FileLogger class. Replaying market data from user-defined log files is not supported. Instead, users are free to implement the Feed Engine abstraction, which extracts data from log files in their own formats. See The Feed Engine Concept for more information.
The OnixS::CME::ConflatedUDP::replayLogFiles function performs the market data replay. It accepts a list of log files to be replayed and a set of Handlers that will process the data stored in the given logs.
If the log files were recorded using the SDK naming conventions, in particular the OnixS::CME::ConflatedUDP::makeLogFilename function, then the file list can be populated using utility routines provided by the SDK.
Alternatively, users may assemble the list of log files manually. A manually populated list makes it possible to replay log files whose names differ from those produced by the OnixS::CME::ConflatedUDP::makeLogFilename utility function.
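As an illustrative sketch, assuming the replay routines accept a plain list of file paths (the exact container type is defined by the SDK), a manually populated list may look as follows; the file names are placeholders:

```cpp
// Illustrative sketch: the exact list type expected by replayLogFiles
// is defined by the SDK; a plain vector of paths is shown here.
std::vector<std::string> logs;
logs.push_back("Channel310-Morning.txt");   // hypothetical names that do not
logs.push_back("Channel310-Afternoon.txt"); // follow the makeLogFilename rules

// The files must belong to one processing session and be listed in
// recording order to form a continuous recording.
```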
A critical requirement of the log replay feature is that the files must form a continuous recording of a single processing session. For example, suppose the Handler is launched each morning and stopped each evening during the trading week. The files produced by a logger within a single day refer to a single processing session and can therefore be replayed together as one list of files. In contrast, files recorded on different days refer to different processing sessions and can't be replayed at once.
The logging services allow using the same instance of the Logger for multiple instances of the Handler; thus, a single log may contain market data belonging to multiple channels. Recent releases make it possible to replay such log files for multiple instances of the Handler.
The following code snippet depicts how to replay a log file containing data recorded by two Handlers for channels 310 and 312:
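A sketch of such a replay is shown below; the Handler construction is omitted, the container types are simplified assumptions, and the file name is a placeholder:

```cpp
using namespace OnixS::CME::ConflatedUDP;

// Two Handlers configured for the channels whose data is present in the log.
Handler handler310; // configured for channel 310 (configuration omitted)
Handler handler312; // configured for channel 312 (configuration omitted)

// The log produced by the single FileLogger instance shared by both Handlers.
std::vector<std::string> logs;
logs.push_back("MarketData.txt"); // placeholder file name

// The set of Handlers that will process the replayed data.
std::vector<Handler*> handlers;
handlers.push_back(&handler310);
handlers.push_back(&handler312);

replayLogFiles(logs, handlers);
```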
The Handler performs market data processing according to its settings. The kind of market data logged and its order depend on the Handler's session settings at the time the log files were produced. Suppose the Handler was configured to perform start-up recovery due to a mid-week late join. As a result, the instrument and market (snapshot) recovery data is put into the log file before the Handler switches to real-time incremental data processing. Also, if the Handler is configured to recover the market state from snapshots each time a packet is lost, a snapshot recovery loop is recorded for each detected gap.
Therefore, by default, the log replay extracts the parameters of the processing session from the given log files and updates the instances of the Handler participating in the replay for the duration of the replay. However, the SDK allows overriding the default behavior and letting the log replay process the logged market data according to user-defined settings.
The primary settings affecting the behavior of the log replay are gathered into the OnixS::CME::ConflatedUDP::LogReplaySupplements class. The following table describes the available settings and how they affect the log replay functionality.
Parameter | Description |
---|---|
OnixS::CME::ConflatedUDP::LogReplaySupplements::settingsUse | Defines how the log replay handles the processing session parameters of the Handlers participating in the replay. If the parameter is set to OnixS::CME::ConflatedUDP::HandlerSettingsUse::Suggested, the log replay extracts the processing session parameters from the log files and temporarily updates each instance of the Handler participating in the replay. This is the default value. Alternatively, if the parameter is set to OnixS::CME::ConflatedUDP::HandlerSettingsUse::AsIs, the log replay does not modify the processing session parameters of any Handler instance. In this mode, the processing flow may differ from the one observed during the original session (whose data is replayed). |
OnixS::CME::ConflatedUDP::LogReplaySupplements::aliases | The replay machinery is based on feed-id matching: the replay engine pushes extracted market data to the feeds used by the Handlers participating in the replay whenever the data source identifier matches the feed identifier. Sometimes, however, more flexibility in matching is needed; for example, a log file may contain data for a different channel. This parameter represents a set of aliases for data sources and tells the replay engine to use the predefined matching instead of the direct correspondence. |
OnixS::CME::ConflatedUDP::LogReplaySupplements::timeSpan | By default, the log replay processes all records from the log files to be replayed. This parameter defines a time interval for which the logged data must be replayed. Entries whose logging time is outside the given time span are skipped by the replay machinery. |
OnixS::CME::ConflatedUDP::LogReplaySupplements::speed | Controls the speed at which market data is replayed. By default, the log replay extracts and pushes market data to the Handler without any delays; therefore, recorded data is replayed faster than it was processed during the original session. However, it is possible to override this behavior and tell the replay machinery to replay market data at the same speed at which it was processed during the original session. |
The following sample depicts how to configure the replay of a log file that was recorded for a different channel and with different session settings.
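A sketch of such a configuration is given below; the accessor signatures, the extra replayLogFiles argument, and the simplified container types are assumptions based on the parameters described above:

```cpp
using namespace OnixS::CME::ConflatedUDP;

std::vector<std::string> logs;  // the log files to replay (population omitted)
std::vector<Handler*> handlers; // the participating Handlers (population omitted)

LogReplaySupplements supplements;

// Keep the Handlers' own session settings instead of the settings
// extracted from the log files.
supplements.settingsUse(HandlerSettingsUse::AsIs);

// The aliases parameter would additionally map the feed ids recorded in the
// log onto the feed ids of the participating Handlers, so that data recorded
// for one channel is consumed by a Handler configured for another one.

replayLogFiles(logs, handlers, supplements);
```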
From the event notification perspective, there is no difference whether the Handler processes market data received from the network or from a log file. This is because the log replay functionality is based on the same Feed Engine concept and internally uses its own Feed Engine, which extracts data not from network sources but from the given log files.
In addition to the log replay, the 5th major release allows replaying market data stored as network packet captures (.PCAP files). The API and the replay machinery are very similar to those of the log replay to provide users with a consistent experience.
The following table summarizes the API available for the PCAP replay:
Function | Description |
---|---|
OnixS::CME::ConflatedUDP::gatherFiles | Gathers the files stored in the given folder with the given extension. The gathered files are sorted by name. |
OnixS::CME::ConflatedUDP::replayPcapFiles | Replays the given list of .PCAP files for the given set of Handlers. |
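Combining the two routines, a PCAP replay might be set up as follows; the folder, the extension, the gatherFiles argument order, and the simplified container types are assumptions:

```cpp
using namespace OnixS::CME::ConflatedUDP;

Handler handler; // configured for the captured channel (configuration omitted)

// Collect the capture files (folder, extension, and argument
// order are illustrative assumptions).
std::vector<std::string> captures;
gatherFiles(captures, "captures", ".pcap");

// The Handlers that will process the replayed packets.
std::vector<Handler*> handlers;
handlers.push_back(&handler);

replayPcapFiles(captures, handlers);
```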
The PCAP replay subsystem processes data from incremental feeds only. Usually, the MDP transmits instrument definitions through the incremental feeds only at the beginning of the week. Thus, if capturing was performed later in the week, no instrument definitions may be present in the recorded packets. The absence of a security definition causes the Handler to use default values for the depth of direct books. Therefore, a relevant value must be established for the corresponding parameter; otherwise, order book maintenance issues may take place during the replay:
handler.settings().bookManagement().directBooks().defaultDepth()
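For example, assuming the setter overload mirrors the getter shown above (the depth value is illustrative and must match the books of the replayed channel):

```cpp
// 10 is an illustrative value only; use the real depth of the
// replayed channel's direct books.
handler.settings().bookManagement().directBooks().defaultDepth(10);
```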
The replay subsystem was improved to avoid issues in the MBP book maintenance caused by the lack of instrument definitions in the replayed data. It allows recovering instrument definitions from a previously recorded cache file or from a 'secdef.dat' file downloaded from the CME FTP. In such a case, the processing session must be configured to recover instruments at the start-up (join) stage, and a path to a file containing the instrument definitions must be defined:
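A sketch of such a configuration is shown below; the setting names used here are illustrative assumptions, not the SDK's actual accessors — consult the SDK reference for the exact API:

```cpp
// Illustrative sketch only: the accessor names below are assumptions.
// The intent is (1) to enable instrument recovery at the join stage and
// (2) to point the Handler to a file with the instrument definitions.
handler.settings().session().recoverInstruments(true);
handler.settings().instruments().file("secdef.dat"); // cache or CME FTP download
```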
Having instrument definitions while replaying market data from PCAP files is also essential for security filtering. Security filtering allows selecting instruments based on their attributes, such as id, security group, symbol, and asset. However, only the security id is a primary attribute, as the MDP uses it to link data such as order book updates with instruments. Other attributes, like symbols or assets, are retrieved from an instrument definition. Therefore, the lack of instrument definitions narrows the filtering capabilities: only filtering by security id will function correctly. Filtering by any other attribute, like symbol, group, or asset, will not work for securities whose definitions are not available during the data replay.
The PCAP replay engine extracts the multicast group information from a captured packet and dispatches the packet to the feed with the same multicast group. This is the way the replay subsystem matches packets and feeds. However, sometimes more flexibility in matching packets to feeds is needed. The PCAP replay control parameters allow affecting the way data matching is performed: the OnixS::CME::ConflatedUDP::PcapReplaySupplements::aliases member exposes an instance of the OnixS::CME::ConflatedUDP::NetAddressAliases class, which allows redirecting data from one source to another at replay time.
Suppose the captured packets belong to the production environment while the Handler is configured with the certification environment's connectivity configuration. The production and certification environments use different multicast groups to serve the same channels, to avoid simultaneous data transmission conflicts. Thus, replaying data belonging to the production environment with a certification connectivity configuration will lead to nothing happening from the user's perspective: the Handler will trigger no events because the replay subsystem will find no data for the certification feeds. Users must define source aliases to make the replay subsystem treat the production feeds as the certification ones. The following code snippet depicts how to solve the described case:
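A sketch of the solution is shown below; the multicast addresses are placeholders, and both the map-style population of NetAddressAliases and the address construction are assumptions:

```cpp
using namespace OnixS::CME::ConflatedUDP;

std::vector<std::string> captures; // the .PCAP files (population omitted)
std::vector<Handler*> handlers;    // the participating Handlers (population omitted)

PcapReplaySupplements supplements;

// Treat the data captured from the production feed as if it arrived
// on the corresponding certification feed (placeholder addresses;
// the map-style insertion is an assumption).
supplements.aliases()[NetAddress("224.0.31.1", 14310)] =
    NetAddress("224.0.28.1", 14310);

replayPcapFiles(captures, handlers, supplements);
```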
CME DataMine is the official source of market data for CME Group markets. CME exposes a wide range of data types, including Market Depth, End-of-Day, Block Trades, etc. The SDK allows customers to run their solutions on realistic data by providing the ability to replay historical data from CME DataMine.
The table below lists the functions available to users:
Function | Description |
---|---|
OnixS::CME::ConflatedUDP::gatherFiles | Gathers the files stored in the given folder with the given extension. The gathered files are sorted by name. |
OnixS::CME::ConflatedUDP::replayDatamineFiles | Replays the given list of CME DataMine files for the given set of Handlers. |
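Put together, a DataMine replay might look as follows; the folder, the extension, the gatherFiles argument order, and the simplified container types are assumptions:

```cpp
using namespace OnixS::CME::ConflatedUDP;

Handler handler; // configured for the recorded channel (configuration omitted)

// Collect the DataMine packet-capture files (folder and extension
// are placeholders).
std::vector<std::string> files;
gatherFiles(files, "datamine", ".pcap");

std::vector<Handler*> handlers;
handlers.push_back(&handler);

replayDatamineFiles(files, handlers);
```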
A critical aspect of the historical data replay feature is the kind of data supported. CME DataMine offers historical data in various formats, including FIX messages and market data packet captures. The replay functionality accepts historical data as market data packet captures only. The Handler's processing engine is built over SBE binary structures to gain maximal performance. Also, the replay functionality simulates data reception and raises events related to packet handling. FIX messages do not contain the information stored in packets; therefore, they can't be used as a source of data for the replay.
See CME Packet Capture Dataset for more information.
A sample application demonstrating the Log Replay functionality can be found in the samples/Replay sub-folder of the SDK distribution package.