Fast reaction times – why end-to-end latency matters in autonomous driving

Autonomous vehicles have to constantly perceive their environment, identifying where it is safe to drive and where other vehicles or pedestrians are located. Beyond the algorithmic performance required to master this task, the end-to-end response time from sensor data acquisition to the initiation of a suitable response is equally important.

Human reaction times can be up to 0.5 seconds (from the occurrence of an event until the braking maneuver is initiated). As different applications have varying operating speeds, the impact of this reaction time also varies. In a typical yard application with vehicles operating at 25 kph, this corresponds to a driven distance of approximately 3.5 m. In highway situations, with trucks driving at 80 kph, it is equivalent to a driven distance of 11.1 m. These values show that the impact of reaction time on the stopping distance is significant.
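
To make these numbers concrete, the following minimal C++ snippet reproduces the calculation (speed converted from kph to m/s, then multiplied by the reaction time); the values match those quoted above:

```cpp
#include <cstdio>

// Distance travelled during a fixed reaction time at constant speed.
double reaction_distance_m(double speed_kph, double reaction_time_s) {
    const double speed_mps = speed_kph / 3.6;  // kph -> m/s
    return speed_mps * reaction_time_s;
}

int main() {
    std::printf("yard (25 kph):    %.1f m\n", reaction_distance_m(25.0, 0.5));  // ~3.5 m
    std::printf("highway (80 kph): %.1f m\n", reaction_distance_m(80.0, 0.5));  // ~11.1 m
}
```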

When designing an autonomous driving stack, developers strive for super-human performance. This translates into significantly stricter reaction-time requirements, with some applications targeting end-to-end latencies as low as 50 to 100 milliseconds (including perception, planning, and driving functions). Multiple factors influence the choice of such a low value:

First, computers process tasks sequentially. This leads to the worst-case response time being roughly two times the end-to-end latency of the software stack: if an event happens right after a sensor has captured a data frame and the processing pipeline has already started, the response can only be initiated after the next data frame is captured and a new processing pass has completed, as sketched below.
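
A minimal sketch of this worst-case timing, under the assumption that the pipeline is fed at the same rate as its own end-to-end latency:

```cpp
#include <cstdio>

// Worst case: an event occurs immediately after a frame was captured, so it
// is only seen in the next frame and then traverses the full pipeline.
double worst_case_response_ms(double frame_period_ms, double pipeline_latency_ms) {
    return frame_period_ms + pipeline_latency_ms;
}

int main() {
    // Illustrative value: a 100 ms pipeline fed at its own processing rate,
    // i.e. frame period == pipeline latency.
    const double latency_ms = 100.0;
    std::printf("worst-case response: %.0f ms\n",
                worst_case_response_ms(latency_ms, latency_ms));  // ~2x the latency
}
```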

Second, the representation of the behavior of other participants in the scene (such as vehicles or pedestrians) is usually imperfect. There are two mitigation strategies: The most intuitive option is to enhance the prediction model with additional training data or more complex algorithms. However, this often increases the computational requirements and therefore the execution time. Alternatively, more frequent observations can also improve predictions, while additionally decreasing the response time to unmodelled effects. As a result, the faster, simpler system might outperform the more complex model thanks to its decreased response time (see the sketch after this paragraph).
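
To illustrate the trade-off, the sketch below compares the worst-case reaction distance of a hypothetical lightweight model running at a 50 ms cycle against a heavier model at 150 ms; the cycle times are assumptions chosen for illustration, not measured values:

```cpp
#include <cstdio>

// Worst-case reaction distance for a model running at a given cycle time:
// roughly two cycles (see above) travelled at the given speed.
double worst_case_distance_m(double cycle_ms, double speed_kph) {
    const double response_s = 2.0 * cycle_ms / 1000.0;
    return (speed_kph / 3.6) * response_s;
}

int main() {
    // Hypothetical models at 80 kph: lightweight (50 ms) vs. complex (150 ms).
    std::printf("lightweight model: %.1f m\n", worst_case_distance_m(50.0, 80.0));   // ~2.2 m
    std::printf("complex model:     %.1f m\n", worst_case_distance_m(150.0, 80.0));  // ~6.7 m
}
```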

These reasons explain the significantly lower maximum acceptable end-to-end latencies in an autonomous driving stack. Besides further advances in compute capabilities, e.g. via increased single-core performance and the utilization of hardware acceleration, there is significant potential on the software side.

The driveblocks mapless autonomy SDK leverages this potential via the deep integration of parallelization strategies in all layers of the software stack. Prominent examples are the perception software components: they are built so that each sensor data stream can be processed individually, which makes it easy to distribute the workload across multiple compute cores and to combine the information in a light-weight fusion step at the end of the detection pipeline (see the sketch below). Other algorithms in the software stack are developed such that they allow parallelization within the algorithm itself.
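
The pattern can be sketched with standard C++ facilities as shown below; the types and functions are placeholders for illustration and do not represent the actual SDK API:

```cpp
#include <functional>
#include <future>
#include <vector>

// Placeholder types standing in for real sensor data and detection results.
struct SensorFrame { /* raw data from one sensor */ };
struct Detections  { std::vector<int> objects; };

// Per-sensor detection pipeline (placeholder implementation).
Detections process_stream(const SensorFrame& frame) { return Detections{}; }

// Light-weight combination step at the end of the detection pipeline.
Detections fuse(const std::vector<Detections>& per_sensor) {
    Detections out;
    for (const auto& d : per_sensor)
        out.objects.insert(out.objects.end(), d.objects.begin(), d.objects.end());
    return out;
}

int main() {
    std::vector<SensorFrame> frames(4);  // e.g. four camera or lidar streams

    // One task per sensor stream; the scheduler distributes them across cores.
    std::vector<std::future<Detections>> tasks;
    for (const auto& f : frames)
        tasks.push_back(std::async(std::launch::async, process_stream, std::cref(f)));

    std::vector<Detections> results;
    for (auto& t : tasks) results.push_back(t.get());

    const Detections fused = fuse(results);  // combine once, at the end
    (void)fused;
}
```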

In addition, we have made execution time monitoring and analysis a core component of our development workflow. This includes the analysis of potential algorithmic improvements, the consideration of real-time requirements in the runtime components, and the utilization of shared memory transport to minimize communication overhead between different parts of the software stack.
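
As an example of what such monitoring can look like in practice, the following scoped timer records a component's execution time on scope exit; this is a minimal sketch, not the SDK's actual tooling:

```cpp
#include <chrono>
#include <cstdio>
#include <string>
#include <utility>

// Minimal scoped timer: measures the lifetime of the enclosing scope.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string name)
        : name_(std::move(name)), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto end = std::chrono::steady_clock::now();
        const auto us =
            std::chrono::duration_cast<std::chrono::microseconds>(end - start_).count();
        std::printf("%s: %lld us\n", name_.c_str(), static_cast<long long>(us));
    }

private:
    std::string name_;
    std::chrono::steady_clock::time_point start_;
};

void detection_step() {
    ScopedTimer timer("detection_step");  // logged when the scope is left
    // ... algorithm under measurement ...
}

int main() { detection_step(); }
```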