OnixS C++ CME MDP Conflated UDP Handler 1.1.2
API documentation
Benchmarking with Network/Kernel Layer

Preface

The default benchmarking procedure described in the Benchmarking Results topic measures application-level latency and does not take into account the time the incoming data spends in the network/kernel layer. This approach makes it possible to analyze the critical path inside the processing machinery (the Handler). However, an application may not receive incoming data as soon as it arrives, so the measured latency does not reflect other important aspects, such as the time the data spends in the network adapter or the kernel.

To improve the analysis, support for hardware timestamps has been added to the SDK. Hardware timestamps are assigned to incoming packets as soon as they arrive at the network adapter. When a hardware timestamp is taken as the starting point of the latency measurement, the measured timespan covers everything from the moment the data arrives at the network card to the moment the results of data processing are delivered to the user.
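As an illustration of the two measurement spans, the following sketch computes both latencies from three timestamps taken from the same clock. The structure and field names (PacketTimes, hardwareTimestampNs, receiveTimeNs, deliveryTimeNs) are hypothetical and are not part of the SDK API.

#include <cstdint>
#include <iostream>

// Hypothetical packet metadata; the names below are illustrative only
// and do not belong to the SDK API.
struct PacketTimes
{
    std::uint64_t hardwareTimestampNs; // assigned by the network adapter on arrival
    std::uint64_t receiveTimeNs;       // taken when the packet reaches the application
};

int main()
{
    // Example values, all taken from the same clock (nanoseconds).
    const PacketTimes packet = { 1000000000ULL, 1000013500ULL };
    const std::uint64_t deliveryTimeNs = 1000015000ULL; // results delivered to the user

    // Application-level latency: excludes time spent in the adapter and the kernel.
    const std::uint64_t applicationLatencyNs = deliveryTimeNs - packet.receiveTimeNs;

    // Network/kernel + application latency: starts from the hardware timestamp.
    const std::uint64_t fullLatencyNs = deliveryTimeNs - packet.hardwareTimestampNs;

    std::cout << "Application-level latency: " << applicationLatencyNs << " ns\n"
              << "Latency including network/kernel layer: " << fullLatencyNs << " ns\n";
    return 0;
}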

The following tables depict the results of benchmarking with the regular (application-only) approach and compare them with the results obtained with the help of hardware timestamps. The measurements were taken for the two major implementations of the Feed Engine machinery exposed by the SDK and encapsulated in the OnixS::CME::ConflatedUDP::SocketFeedEngine and OnixS::CME::ConflatedUDP::SolarflareFeedEngine classes, which use ordinary sockets and the Solarflare ef_vi SDK, respectively. Additionally, the socket-based feed engine was benchmarked in the OpenOnload environment to show the actual benefit of using OpenOnload with ordinary (socket-based) solutions.
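For reference, a minimal sketch of selecting one of the two feed engine implementations is shown below. Only the SocketFeedEngine and SolarflareFeedEngine class names come from the SDK; the header path, the HandlerSettings class, the feedEngine member, and the constructor signatures are assumptions made for illustration, so consult the reference pages of those classes for the exact API.

// Sketch only: everything except the two feed engine class names is assumed,
// not the documented SDK API.
#include <OnixS/CME/ConflatedUDP.h> // assumed umbrella header

using namespace OnixS::CME::ConflatedUDP;

int main()
{
    HandlerSettings settings; // assumed settings class

    // Ordinary sockets; can be transparently accelerated by running the
    // application under OpenOnload.
    SocketFeedEngine socketEngine;

    // Kernel-bypass engine for Solarflare adapters using the ef_vi SDK:
    // SolarflareFeedEngine solarflareEngine;

    settings.feedEngine = &socketEngine; // assumed way of attaching the engine

    Handler handler(settings); // assumed constructor
    // ... subscribe to market data and start processing ...
    return 0;
}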

Application-Level Only With 5 Milliseconds Delay Between Packets

Statistics   SocketFeedEngine (μs)   SolarflareFeedEngine (μs)
Minimal      0.993                   0.856
Median       1.144                   1.017
Mean         1.203                   1.071
95%          1.493                   1.383
99%          2.047                   1.830
Maximal      14.344                  14.680

With Network/Kernel-Layer And 5 Milliseconds Delay Between Packets

Statistics   SocketFeedEngine (μs)   SolarflareFeedEngine (μs)
Minimal      14.007                  2.190
Median       52.087                  2.664
Mean         48.853                  2.709
95%          55.894                  3.152
99%          60.966                  3.798
Maximal      79.933                  16.385

Application-Level Only Without Any Delay Between Packets

Statistics   SocketFeedEngine (μs)   SocketFeedEngine + OpenOnload (μs)   SolarflareFeedEngine (μs)
Minimal      0.507                   0.533                                0.600
Median       0.667                   0.693                                0.714
Mean         0.739                   0.765                                0.778
95%          1.156                   1.113                                1.067
99%          1.564                   1.560                                1.490
Maximal      14.019                  14.617                               14.756

With Network/Kernel-Layer And Without Any Delay Between Packets

Statistics   SocketFeedEngine (μs)   SocketFeedEngine + OpenOnload (μs)   SolarflareFeedEngine (μs)
Minimal      10.786                  3.235                                1.925
Median       12.610                  4.148                                2.353
Mean         12.676                  4.267                                2.377
95%          14.238                  5.260                                2.853
99%          15.562                  9.407                                3.310
Maximal      64.169                  22.261                               16.210

Conclusions and Important Notes

Using hardware timestamps brings kernel/hardware latency into the overall measurements and thus allows a more precise analysis. For example, as the tables above show, application-only latency is similar for the major implementations of the Feed Engine exposed by the SDK. However, using hardware timestamps reveals the real benefit of kernel-bypass solutions such as OpenOnload. Moreover, it quantifies the advantage of using specialized solutions like the Solarflare ef_vi SDK when working with multicast data.