The default benchmarking procedure described in the Benchmarking Results topic measures application-level latency and does not take into account the time incoming data spends in the network/kernel layer. This approach makes it possible to analyze the critical path inside the processing machinery (the Handler). However, an application may not receive incoming data as soon as it arrives. Therefore, the measured latency does not reflect other important aspects, such as the time data spends in the network adapter or the kernel.
To improve the analysis, support for hardware timestamps has been added to the SDK. A hardware timestamp is assigned to an incoming packet as soon as it arrives at the network adapter. When hardware timestamps are taken as the starting point of the latency measurement, the result is a timespan from the moment the data arrives at the network card to the moment the results of data processing are delivered to the user.
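On Linux, hardware receive timestamps can be obtained from an ordinary socket through the SO_TIMESTAMPING option. The following sketch illustrates only that underlying mechanism, not the SDK's internal implementation; the function names are illustrative, and a UDP socket already joined to the feed is assumed:

```cpp
// A minimal sketch (not the SDK's internal implementation) of the Linux
// mechanism behind hardware receive timestamps: the SO_TIMESTAMPING socket
// option requests them, and recvmsg() delivers them as ancillary data.
#include <linux/errqueue.h>   // scm_timestamping
#include <linux/net_tstamp.h> // SOF_TIMESTAMPING_* flags
#include <sys/socket.h>
#include <cstddef>
#include <cstring>
#include <ctime>

// Asks the kernel to attach raw NIC timestamps to incoming packets.
bool enableHardwareRxTimestamps(int socketFd)
{
    const int flags =
        SOF_TIMESTAMPING_RX_HARDWARE | SOF_TIMESTAMPING_RAW_HARDWARE;

    return 0 ==
        setsockopt(socketFd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof flags);
}

// Receives a single datagram and extracts the NIC-assigned timestamp from
// the ancillary data. Returns true if a hardware timestamp was present.
bool receiveWithHardwareTimestamp(
    int socketFd, char* buffer, size_t bufferSize, timespec& hardwareTime)
{
    char control[256];

    iovec io = {};
    io.iov_base = buffer;
    io.iov_len = bufferSize;

    msghdr msg = {};
    msg.msg_iov = &io;
    msg.msg_iovlen = 1;
    msg.msg_control = control;
    msg.msg_controllen = sizeof control;

    if (recvmsg(socketFd, &msg, 0) < 0)
        return false;

    for (cmsghdr* cmsg = CMSG_FIRSTHDR(&msg); cmsg != nullptr;
         cmsg = CMSG_NXTHDR(&msg, cmsg))
    {
        if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_TIMESTAMPING)
        {
            scm_timestamping stamps;
            std::memcpy(&stamps, CMSG_DATA(cmsg), sizeof stamps);

            hardwareTime = stamps.ts[2]; // Index 2 holds the raw hardware time.
            return hardwareTime.tv_sec != 0 || hardwareTime.tv_nsec != 0;
        }
    }

    return false;
}
```

Note that before such timestamps appear, the adapter itself has to be switched into timestamping mode (the SIOCSHWTSTAMP ioctl described in the Linux timestamping documentation), and the difference between the NIC clock and the host clock is only meaningful when the two are synchronized, for example via PTP.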
The following tables show the results of benchmarking with the regular (application-only) approach and compare them with the results obtained with the help of hardware timestamps. The measurements were taken for the two major implementations of the Feed Engine machinery exposed by the SDK and encapsulated in the OnixS::CME::ConflatedUDP::SocketFeedEngine and OnixS::CME::ConflatedUDP::SolarflareFeedEngine classes, which use ordinary sockets and the Solarflare ef_vi SDK, respectively. Additionally, the socket-based feed engine was benchmarked in the OpenOnload environment to show the actual benefit of using OpenOnload with ordinary (socket-based) solutions.
Application-level latency:

Statistics | SocketFeedEngine (μs) | SolarflareFeedEngine (μs)
---|---|---
Minimal | 0.993 | 0.856
Median | 1.144 | 1.017
Mean | 1.203 | 1.071
95% | 1.493 | 1.383
99% | 2.047 | 1.830
Maximal | 14.344 | 14.680
Latency measured from hardware timestamps:

Statistics | SocketFeedEngine (μs) | SolarflareFeedEngine (μs)
---|---|---
Minimal | 14.007 | 2.190
Median | 52.087 | 2.664
Mean | 48.853 | 2.709
95% | 55.894 | 3.152
99% | 60.966 | 3.798
Maximal | 79.933 | 16.385
Application-level latency:

Statistics | SocketFeedEngine (μs) | SocketFeedEngine + OpenOnload (μs) | SolarflareFeedEngine (μs)
---|---|---|---
Minimal | 0.507 | 0.533 | 0.600
Median | 0.667 | 0.693 | 0.714
Mean | 0.739 | 0.765 | 0.778
95% | 1.156 | 1.113 | 1.067
99% | 1.564 | 1.560 | 1.490
Maximal | 14.019 | 14.617 | 14.756
Latency measured from hardware timestamps:

Statistics | SocketFeedEngine (μs) | SocketFeedEngine + OpenOnload (μs) | SolarflareFeedEngine (μs)
---|---|---|---
Minimal | 10.786 | 3.235 | 1.925
Median | 12.610 | 4.148 | 2.353
Mean | 12.676 | 4.267 | 2.377
95% | 14.238 | 5.260 | 2.853
99% | 15.562 | 9.407 | 3.310
Maximal | 64.169 | 22.261 | 16.210
Using hardware timestamps brings kernel/hardware latency into the overall measurement and thus makes the analysis more precise. For example, as the tables above show, application-only latency is similar across the major implementations of the Feed Engine exposed by the SDK. However, using hardware timestamps reveals the real benefit of kernel-bypass solutions such as OpenOnload. Moreover, it quantifies the advantage of using specialized solutions like the Solarflare ef_vi SDK when working with multicast data.
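For reference, the statistics reported in the tables can be derived from raw per-packet latency samples along the following lines. This is a minimal sketch, not the SDK's benchmarking harness; the nearest-rank percentile definition and the report routine are illustrative assumptions:

```cpp
// Illustrative-only computation of the figures reported in the tables,
// given per-packet latency samples in microseconds.
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <numeric>
#include <vector>

// Nearest-rank percentile over an ascending-sorted sample set.
double percentile(const std::vector<double>& sorted, double p)
{
    const size_t rank = static_cast<size_t>(p * (sorted.size() - 1));
    return sorted[rank];
}

// Prints the same statistics the tables above report.
void report(std::vector<double> latencies)
{
    std::sort(latencies.begin(), latencies.end());

    const double mean =
        std::accumulate(latencies.begin(), latencies.end(), 0.0) /
        latencies.size();

    std::printf("Minimal: %.3f\n", latencies.front());
    std::printf("Median : %.3f\n", percentile(latencies, 0.50));
    std::printf("Mean   : %.3f\n", mean);
    std::printf("95%%    : %.3f\n", percentile(latencies, 0.95));
    std::printf("99%%    : %.3f\n", percentile(latencies, 0.99));
    std::printf("Maximal: %.3f\n", latencies.back());
}
```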