Machine Learning in Sensors for Collision Avoidance

Erkan Karakus, Tao Wei, and Qing Yang
Dept. of Electrical, Computer, and Biomedical Engineering
University of Rhode Island
Kingston, RI, 02881 USA
{erkan_karakus, tao_wei, qyang}@uri.edu

Abstract—Sensors generate a huge amount of data that need to be transferred to a computing device for processing. Such large data transfer takes time and consumes energy. This paper presents a new sensing and computing architecture, referred to as MLIS (Machine Learning in Sensors). MLIS allows a part of machine learning to be done on sensor board thereby dramatically reducing the amount of data transferred to the computing device and hence improving overall system performance and energy efficiency. Using an energy-based probabilistic graphical model, RBM (Restricted Boltzmann Machine), we built a new ADAS (Advanced Driver-Assistance System) computing platform for autonomous driving with phased-array-radar as sensors. A working prototype has been built to provide proof of concept for our new architecture. The prototype is implemented using a TI’s mmWave (millimeter Wave) radar board and a Vivado HLS implementation of the RBM on the Xilinx xc7z020-clg400-1 device. Extensive experiments have been carried out using the prototype on realistic scenes on our campus. Experimental results have shown that the proposed architecture can reduce the data to be transferred by a factor of 8 while maintaining 98% accuracy. Based on the experimental settings, we present two case studies that have shown a remarkable reduction in collision probability if applying the new architecture to autonomous vehicles.

Keywords—In-sensor computing, computer architecture, mmWave radar, deep learning, object detection and identification

I. INTRODUCTION

In today’s digital world, sensors generate a huge amount of data in a variety of applications. This data is generally transferred to the DRAM of a computing system for processing, through networks or direct connections. Transferring it consumes energy and introduces long delays. In real-time applications, especially mission-critical ones such as autonomous vehicles, any delay beyond the hard deadline can be a matter of life and death. Therefore, minimizing data transfer, and hence latency, is extremely important.

Internal communication and computing in an autonomous vehicle should be designed to provide fault tolerance, energy efficiency, determinism, high bandwidth, and flexibility [1]. End-to-end latency and communication overhead play a critical role in ensuring data consistency and temporal determinism across functional cause-effect chains [2], [3].

This paper presents a new sensing, communication, and computing architecture for autonomous vehicles. The new architecture, referred to as MLIS (Machine Learning in Sensors), leverages an energy-based generative model, the RBM (Restricted Boltzmann Machine). MLIS involves three major steps. First, sensed raw data go through the first layers of the RBM as a “generate phase” on the sensor board. Second, the outputs of the hidden layer units of the RBM are transmitted to the central ADAS computer. Third, the ADAS computer carries out 1-step Gibbs sampling as the “reconstruct phase”. Because the generate phase is computed in the sensor, less data is transferred from the sensor board to the DRAM of the central ADAS computer, which dramatically improves the performance and energy efficiency of MLIS.

After establishing the accuracy of MLIS, we built a working prototype using a Texas Instruments AWR1243 FMCW mmWave radar board and a Xilinx xc7z020-clg400-1 device. Extensive experiments have been carried out using our prototype on practical scenes on our campus for object detection and decision making. Experimental results have shown that MLIS yields a reconstruction accuracy of 98%, reduces dimensionality by a factor of 8, and reduces the point-to-point data transfer latency by a factor of 6. To demonstrate how such latency reduction helps improve collision avoidance in autonomous vehicle applications, two case studies were performed using our prototype MLIS to show a dramatic reduction in collision probabilities.

This paper makes the following contributions:

1) A new in-sensor computing architecture is proposed on a radar sensor board that implements RBM together with the ADAS computer.
2) A two-phase 1-step Gibbs sampling computation framework consisting of a generate phase and a reconstruct phase is presented.
3) Extensive experiments have been conducted to demonstrate the feasibility and performance of MLIS. Numerical results have shown up to 8 times reduction in data transfers between the sensor board and central computer with a reconstruction accuracy of 98%.
4) Braking distance gain due to latency reduction is shown to be significant when implementing the generate phase of the MLIS on an FPGA.
5) Our case studies show that the probability of collision is reduced dramatically with the new MLIS architecture.

The paper is organized as follows. Section II presents the basic architecture of MLIS, including the generate and reconstruct computations of the Restricted Boltzmann Machine (RBM) model. Section III presents experimental results and case studies. We discuss related work in Section IV and conclude the paper in Section V.

II. SYSTEM ARCHITECTURE

We propose a new architecture for I/Q data reduction and reconstruction as shown in Fig. 1. The left-hand side of the diagram represents the in-sensor computing platform consisting of the radar sensor and the FPGA board, while the right-hand side represents the ADAS computing platform. The implementation of MLIS is split into two distinct phases, generate and reconstruct, as shown in Fig. 2. In the rest of the paper we refer to this model as rtl+sw based MLIS, since the generate phase is implemented as an RTL model and the reconstruct phase as a software model.

![Diagram of Proposed Architecture](image)

**Fig. 1** Proposed Architecture for I/Q Data Reduction and Reconstruction.

The generate and reconstruct phase computations are performed using the following equations, respectively.

\[
h = v^T W + b
\]

\[
v = h^T W + a
\]

where \(b\) is the bias vector of the hidden layer units, \(a\) is the bias vector of the visible layer units, \(W\) is the matrix of connection weights between the visible and hidden layer units, and \(v\) and \(h\) are the visible and hidden layer unit activations, respectively.
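The two phases can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: \(W\) is taken to have shape (visible, hidden), so the reconstruct phase applies its transpose to make the dimensions agree; the weights are random placeholders rather than a trained RBM; and the function names `generate_phase` and `reconstruct_phase` are ours, not part of the prototype.

```python
import numpy as np

def generate_phase(v, W, b):
    """Sensor side: map visible activations (n_visible,) to hidden ones (n_hidden,)."""
    return v @ W + b

def reconstruct_phase(h, W, a):
    """ADAS side: map hidden activations back to the visible space.

    Assumes W has shape (n_visible, n_hidden), hence the transpose.
    """
    return h @ W.T + a

rng = np.random.default_rng(0)
n_visible, n_hidden = 512, 64

# Placeholder parameters (untrained), used only to show the data flow.
W = rng.normal(scale=0.01, size=(n_visible, n_hidden))
b = np.zeros(n_hidden)
a = np.zeros(n_visible)

v = rng.normal(size=n_visible)        # stand-in for one block of I (or Q) samples
h = generate_phase(v, W, b)           # this is what would cross the link
v_hat = reconstruct_phase(h, W, a)    # approximation rebuilt on the ADAS computer

reduction_factor = v.size // h.size   # 512 / 64
```

With a (512, 64) configuration only 64 values per block leave the sensor board, which is where the factor-of-8 reduction comes from.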

III. RESULTS AND DISCUSSIONS

A. I/Q Data Reconstruction Accuracy

Fig. 3 and Fig. 4 show sample plots of the true I and Q code data, respectively, together with the data reconstructed by the proposed MLIS architecture. The original true values are plotted using solid blue lines, the sw based reconstructed data using dashed red lines, and the rtl+sw based reconstructed data using solid green lines.

To quantify accuracy, we use the Normalized Euclidean Distance (NED) between the two time series: the reconstructed and the true I/Q codes. TABLE I shows the measured statistics for sw and rtl+sw MLIS architectures with different sizes of visible and hidden layer units \((v,h)\). As seen from TABLE I, rtl+sw based MLIS with the \((512, 64)\) configuration provides a data reduction factor of 8 with 98.03% accuracy for I code data and 95.7% accuracy for Q code data. For the \((256, 64)\) configuration, MLIS provides a data reduction factor of 4 with 98.3% accuracy for I code data and 94.1% accuracy for Q code data.
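The exact NED normalization is not spelled out here, and several definitions are in use. The sketch below assumes one common form, \(\|x - \hat{x}\|_2 / (\|x\|_2 + \|\hat{x}\|_2)\), which is bounded in [0, 1] so that \(1 - \text{NED}\) reads naturally as a percentage. The function names and the synthetic test signal are hypothetical.

```python
import numpy as np

def ned(true, recon):
    """Normalized Euclidean distance (assumed form, bounded in [0, 1])."""
    true = np.asarray(true, dtype=float)
    recon = np.asarray(recon, dtype=float)
    denom = np.linalg.norm(true) + np.linalg.norm(recon)
    if denom == 0.0:
        return 0.0  # both series are all zeros, so they are identical
    return float(np.linalg.norm(true - recon) / denom)

def accuracy_percent(true, recon):
    """Accuracy reported as (1 - NED) in percent."""
    return 100.0 * (1.0 - ned(true, recon))

# Synthetic stand-in for an I-code time series and a slightly distorted copy.
t = np.linspace(0.0, 1.0, 256)
true_i = np.sin(2 * np.pi * 5 * t)
recon_i = true_i + 0.01 * np.cos(2 * np.pi * 40 * t)

acc = accuracy_percent(true_i, recon_i)
```

A different normalization (for example dividing by \(\|x\|_2\) alone) would shift the reported percentages, so the numbers in TABLE I should be compared only against values computed with the same definition.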

![Diagram of Generate and Reconstruct Phases](image)

**Fig. 2** The Generate and Reconstruct Phases of MLIS.

TABLE I

<table>
<thead>
<tr>
<th>\((v,h)\)</th>
<th>Type</th>
<th>Accuracy Statistics (1-NED)%</th>
</tr>
</thead>
<tbody>
<tr>
<td>\((512,256)\)</td>
<td>sw</td>
<td>98.1%</td>
</tr>
<tr>
<td>\((512,128)\)</td>
<td>sw</td>
<td>98.1%</td>
</tr>
<tr>
<td>\((512,64)\)</td>
<td>sw</td>
<td>98.2%</td>
</tr>
<tr>
<td>\((256,128)\)</td>
<td>sw</td>
<td>98.4%</td>
</tr>
<tr>
<td>\((256,64)\)</td>
<td>sw</td>
<td>98.3%</td>
</tr>
</tbody>
</table>

B. Data Transfer Latency

In our experiments, the total amount of data transferred over Ethernet from the data capture board, DCA1000EVM, to the ADAS computer can be computed by multiplying the ADC sample size, loop count, frame count, bit length, LVDS lane count, and channel count. Therefore, transfer latency is given by the following equation:

\[
T = \frac{\text{ADC Sample Size} \times \text{Loop Count} \times \text{Frame Count} \times \text{Bit Length} \times \text{Lane Count} \times \text{Channel Count}}{\text{Bandwidth}}
\]

TABLE II shows the comparison in data transfer latencies for different ADC sample size configurations. The first two columns correspond to the original configuration with ADC sample sizes of 512 and 256, respectively. The last three columns correspond to the case where MLIS is deployed to reduce the dimensionality of ADC sample size to 128, 64, and 32, respectively.

![Diagram](image)

**TABLE II** Data Transfer Latencies for Different ADC Sample Size Cases

<table>
<thead>
<tr>
<th>ADC Sample Size</th>
<th>Original</th>
<th>MLIS</th>
</tr>
</thead>
<tbody>
<tr>
<td>512</td>
<td>128</td>
<td>128</td>
</tr>
<tr>
<td>256</td>
<td>64</td>
<td>64</td>
</tr>
<tr>
<td>128</td>
<td>32</td>
<td>32</td>
</tr>
<tr>
<td>Frame Count</td>
<td>10</td>
<td></td>
</tr>
<tr>
<td>Bit Length</td>
<td>16</td>
<td>22</td>
</tr>
<tr>
<td>Lane Count</td>
<td>4</td>
<td></td>
</tr>
<tr>
<td>Channel Count (I/Q)</td>
<td>2</td>
<td>2</td>
</tr>
<tr>
<td>BW (Mbps)</td>
<td>660</td>
<td></td>
</tr>
<tr>
<td>Tr. Latency (ms)</td>
<td>140</td>
<td>70</td>
</tr>
</tbody>
</table>

...
The bit length used for I/Q samples is 16 bits in the original case and 22 bits in the MLIS case with the FPGA implementation, since we use the ap_fixed<22,16> data format in the RTL design. The average data transfer rate between the data capture board, DCA1000EVM, and the ADAS computer is 600 Mbps. Each ADC sample is represented in a complex data format consisting of a real part (I code) and an imaginary part (Q code), and each LVDS lane captures the complex data samples of one receiver antenna. The AWR1243 radar board has 4 LVDS lanes and each lane receives I and Q codes, thus the channel count is 2. As shown in TABLE II, the achieved speedup in transfer latency ranges from about 1.46x (70/48, i.e., 46% faster) up to an order of magnitude (140/12).
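The latency formula can be checked numerically with a short script. The frame count, bit lengths, lane count, channel count, and the 600 Mbps rate are taken from this subsection. The loop count of 128 is an assumption: it is not stated in the text and is used here only because it reproduces the quoted 140 ms, 70 ms, 48 ms, and 12 ms figures. The function name is hypothetical.

```python
def transfer_latency_ms(adc_samples, loop_count, frame_count, bit_length,
                        lane_count, channel_count, bandwidth_mbps):
    """Transfer latency in milliseconds: total bits divided by link rate."""
    total_bits = (adc_samples * loop_count * frame_count * bit_length
                  * lane_count * channel_count)
    return total_bits / (bandwidth_mbps * 1e6) * 1e3

LOOP_COUNT = 128  # assumption, not given in the text

common = dict(loop_count=LOOP_COUNT, frame_count=10, lane_count=4,
              channel_count=2, bandwidth_mbps=600)

# Original configurations use 16-bit samples.
original_512 = transfer_latency_ms(512, bit_length=16, **common)  # about 140 ms
original_256 = transfer_latency_ms(256, bit_length=16, **common)  # about 70 ms

# MLIS configurations use 22-bit fixed-point values.
mlis_128 = transfer_latency_ms(128, bit_length=22, **common)      # about 48 ms
mlis_32 = transfer_latency_ms(32, bit_length=22, **common)        # about 12 ms

smallest_speedup = original_256 / mlis_128   # about 1.46x
largest_speedup = original_512 / mlis_32     # about 11.6x
```

Note that the wider 22-bit format partly offsets the reduction in sample count: going from 512 to 64 values cuts the count by 8, but the transferred bits shrink by 8 x 16/22, roughly 5.8.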

C. The Model Processing Latencies

The total end-to-end data reduction and reconstruction latency is the sum of the data transfer latency and the model processing latency. TABLE III shows the model processing latencies of the architecture on the FPGA board (Vivado HLS FPGA) and the ADAS CPU, respectively. The model processing latency on the FPGA board (Vivado HLS FPGA) is in the range of nanoseconds, while the processing time on the ADAS CPU is 17 milliseconds on average, as shown in TABLE III. TABLE IV shows the reduction in total latency, including the latency of transferring the reduced radar data from the FPGA board to the ADAS computer and the latency of model processing.

D. Probabilistic Analysis of Collisions

The Collision Avoidance System (CAS) and the Automated Emergency Braking System (AEBS) are crucial functions of ADAS applications. The probability of non-collision with an object can be described as the probability of the braking distance being less than the object distance. We assume that the braking distance has a variance due to factors such as tire tread and the condition of the braking mechanics. The probability of collision can be expressed as

\[ P(\text{Collision}) = 1 - P(d_b < d_o \mid V = v) \tag{4} \]

where \(d_b\) and \(d_o\) are the braking and object distances, respectively, and \(v\) is the vehicle speed. The mean value of the braking distance is given by the following equation [4].

\[ \mu_{bd} = \frac{v^2}{2\mu g} \tag{5} \]

where \(v\) (m/s) is the velocity, \(\mu\) is the friction coefficient, and \(g\) (m/s²) is the acceleration due to gravity. Then the deceleration is given by

\[ a = -\mu g. \tag{6} \]
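Equations (4) and (5) can be turned into a small numerical sketch. The distribution of the braking distance is not specified in this section, so the code assumes it is normal with mean \(\mu_{bd}\) and a standard deviation `sigma_m`; both that assumption and the example values (friction coefficient 0.9, standard deviation 1 m) are hypothetical and chosen only for illustration.

```python
import math

G = 9.81  # acceleration due to gravity, m/s^2

def mean_braking_distance(speed_mps, friction):
    """Mean braking distance from Eq. (5): v^2 / (2 * mu * g)."""
    return speed_mps ** 2 / (2.0 * friction * G)

def collision_probability(speed_mps, object_distance_m, friction, sigma_m):
    """Eq. (4) under an assumed normal braking-distance distribution."""
    mu_bd = mean_braking_distance(speed_mps, friction)
    z = (object_distance_m - mu_bd) / sigma_m
    # Standard normal CDF gives P(d_b < d_o).
    p_no_collision = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 1.0 - p_no_collision

speed = 50 / 3.6          # 50 kph in m/s
friction = 0.9            # assumed dry-road friction coefficient
sigma = 1.0               # assumed spread of the braking distance, in meters

mu_bd = mean_braking_distance(speed, friction)
p_near = collision_probability(speed, 11.0, friction, sigma)
p_far = collision_probability(speed, 15.0, friction, sigma)
```

Under this model, every meter of distance recovered by lower latency shifts \(d_o\) to the right of the braking-distance distribution, which is why a gain of a meter or two can change the collision probability substantially when the object sits close to the mean braking distance.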

TABLE III. RTL AND SW MODEL EXECUTION LATENCIES ON HLS AND ADAS CPU

<table>
<thead>
<tr>
<th>ADC Sample Size</th>
<th>Reduced Size</th>
<th>rtl model absolute processing latency(ms) Vivado HLS</th>
<th>sw model average processing latency(ms) ADAS CPU</th>
</tr>
</thead>
<tbody>
<tr>
<td>256</td>
<td>32</td>
<td>28.567</td>
<td>17</td>
</tr>
<tr>
<td>256</td>
<td>16</td>
<td>1.430</td>
<td>17</td>
</tr>
<tr>
<td>128</td>
<td>32</td>
<td>0.87</td>
<td>17</td>
</tr>
<tr>
<td>128</td>
<td>16</td>
<td>0.78</td>
<td>17</td>
</tr>
</tbody>
</table>

TABLE IV. DATA TRANSFER AND PROCESSING LATENCY REDUCTION

<table>
<thead>
<tr>
<th>Original/Reduced Data Sample Size</th>
<th>Data Transfer Latency Reduction (ms)</th>
<th>Processing Latency Reduction (ms)</th>
<th>Total Latency Reduction (ms)</th>
</tr>
</thead>
<tbody>
<tr>
<td>256/32</td>
<td>58</td>
<td>17</td>
<td>75</td>
</tr>
<tr>
<td>256/16</td>
<td>64</td>
<td>17</td>
<td>81</td>
</tr>
<tr>
<td>128/32</td>
<td>23</td>
<td>17</td>
<td>40</td>
</tr>
<tr>
<td>128/16</td>
<td>29</td>
<td>17</td>
<td>46</td>
</tr>
</tbody>
</table>

E. Case Studies for Collision Avoidance

We present two different collision scenarios where AEBS is engaged immediately to prevent any collision. We use the deceleration profiles given in TABLE V for the two scenarios. Braking distances were computed by using the formula given in [4].

<table>
<thead>
<tr>
<th>Case</th>
<th>Deceleration (m/s²)</th>
<th>Starting Velocity (kph)</th>
<th>Final Velocity (kph)</th>
<th>Condition</th>
</tr>
</thead>
<tbody>
<tr>
<td>I</td>
<td>-8.83</td>
<td>50</td>
<td>0</td>
<td>Dry roadway</td>
</tr>
<tr>
<td>II</td>
<td>-8.83</td>
<td>70</td>
<td>10</td>
<td>Dry roadway</td>
</tr>
</tbody>
</table>

TABLE V Deceleration Profiles for Two Cases
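The braking distances for the two profiles can be reproduced with a short calculation. The formula from [4] is not restated here, so this sketch assumes standard constant-deceleration kinematics, \(d = (v_0^2 - v_1^2) / (2|a|)\); the function names are hypothetical.

```python
def kph_to_mps(kph):
    """Convert kilometers per hour to meters per second."""
    return kph / 3.6

def braking_distance_m(start_kph, final_kph, decel_mps2):
    """Distance covered while slowing from start to final speed at constant deceleration."""
    v0 = kph_to_mps(start_kph)
    v1 = kph_to_mps(final_kph)
    return (v0 ** 2 - v1 ** 2) / (2.0 * abs(decel_mps2))

# Profiles from TABLE V.
case_1 = braking_distance_m(50, 0, -8.83)    # roughly 10.9 m
case_2 = braking_distance_m(70, 10, -8.83)   # roughly 21.0 m
```

The Case I value is close to the 11 m stopping distance quoted in the Case I description.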

Case I: Collision Avoidance at Pedestrian Crossing:

Let us consider the scenario depicted in Fig. 5, in which a car is cruising at a certain speed while a pedestrian suddenly emerges from in front of a minivan to cross the street. Since the pedestrian is occluded by the minivan, the driver of the car cannot see the pedestrian. This may lead to a serious collision, so AEBS must be engaged immediately. As soon as the car detects the pedestrian in its path, the ADAS engages the AEBS to stop the car. If the car is cruising at a speed of 50 kph and the deployed MLIS configuration is 512/64, the distance gained is 1.8 meters.

![Fig. 5 Case I: AEBS with Pedestrian Crossing Scenario](image)

After AEBS is engaged, the car will decelerate and stop after 11 m preventing any collision. The car would have traveled 1.8 meters farther without MLIS, causing a collision.

![Fig. 6 Case I: Collision Probability with and w/o MLIS (a) for different car speeds. (b) for different pedestrian distances while cruising at 50 kph](image)

Case II: Collision Avoidance with a Leading Car:

In this scenario, a car is cruising at a speed of 70 kph behind an SUV. The driver of the SUV brakes suddenly to avoid a collision with an object in front of it. The car behind the SUV detects its rapid deceleration and engages the FCW system to alert the driver of the hazard. After the FCW engagement, the driver of the car brakes until the car is at a safe distance from the decelerating SUV. This scenario is depicted in Fig. 7. MLIS with the 512/64 configuration helps gain 2.6 meters of distance, greatly reducing the chance of collision. Fig. 8 (a) shows the probability of collision with and without MLIS for different speeds when the leading SUV is 21 meters away from the car at the time it is detected by the radar sensor board. As seen in Fig. 8 (a), the probability of collision is substantially lower with MLIS than without it for the same car speed. As the speed increases, the probability of collision increases, as expected. Similar observations are shown in Fig. 8 (b) for different leading SUV distances while the vehicle cruises at 70 kph.
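As a rough consistency check, the distance gains quoted for the two cases can be related to the latency saved. Assuming the gain is simply the distance covered at cruising speed during the saved time \(\Delta t\) (an assumption of this sketch, not a formula stated in the text), we have

\[
\Delta d = v\,\Delta t
\quad\Longrightarrow\quad
\Delta t = \frac{\Delta d}{v}
\]

\[
\text{Case I: } v = \frac{50}{3.6} \approx 13.9\ \text{m/s},\qquad
\Delta t \approx \frac{1.8}{13.9} \approx 0.13\ \text{s}
\]

\[
\text{Case II: } v = \frac{70}{3.6} \approx 19.4\ \text{m/s},\qquad
\Delta t \approx \frac{2.6}{19.4} \approx 0.13\ \text{s}
\]

Both cases imply the same saving of roughly 130 ms, as expected when the same 512/64 configuration is deployed in each.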

![Fig. 7 Case II: Forward Collision Warning Scenario](image)

![Fig. 8 Case II: Collision Probability with and w/o MLIS (a) for different car speeds. (b) for different leading SUV distances while cruising at 70 kph.](image)

IV. RELATED WORK

The major challenge with edge computing is the limited resources available on the edge devices and the large amount of sensed raw data to be transferred to the main computer for further processing [5], [6]. Any latency may become a major bottleneck in applications that have hard real-time computation requirements [7]. High Correlation Filter, Principal Component Analysis, and General Discriminant Analysis are among the techniques used for dimensionality reduction [8]. Vanilla Autoencoders and Convolutional Autoencoders are commonly used in deep neural networks to remove noise and redundant information in high-dimensional data [9]. RBM has captured researchers’ interest, and many researchers have produced hardware designs of the RBM model on FPGAs [10], [11]. The field of time series forecasting has received significant interest in academia and has a wide range of applications in the areas of energy, communication, business, finance, health, and sports [12], [13], [14], [15], [16], [17].

Our work in this paper differs from the studies discussed above in several substantial ways. First, our work concentrates on performing a portion of the machine learning computations on the radar sensor board to minimize the data that must be transferred from the radar sensor board to the computing device. Second, to reconstruct the approximated data by applying 1-step Gibbs sampling, we implemented the RBM model on the FPGA and its reverse symmetric model on the computing device. Third, we carried out extensive experiments to demonstrate the advantages of MLIS in a real-time application: ADAS in autonomous vehicles. Furthermore, a probabilistic collision avoidance model showed that the probability of collision decreases dramatically with our MLIS architecture.

V. CONCLUSIONS

We introduced a novel in-sensor machine learning system, MLIS, applicable to ADAS in autonomous vehicles. The idea is to reduce the amount of data transferred from the sensor board to the main computing platform by performing machine learning computations on the sensor board. The new architecture has been shown to substantially reduce the time for decision making, which is critical for real-time applications. We used the Texas Instruments’ AWR1243 FMCW radar board and Vivado HLS to implement an experimental prototype. Experimental findings showed that it is possible to reduce the data transferred from the sensor board to the central computer by a factor of up to 8 with an accuracy of 98%. The probabilistic collision model showed that the data reduction dramatically reduced the collision probability in two collision cases because of the gain in ADAS reaction time, thus increasing the vehicle's ability to prevent serious collisions in the most severe cases.

ACKNOWLEDGMENT

This research is supported in part by the National Science Foundation (NSF) under grants 2027069 and 2106750. The authors thank the anonymous reviewers for their comments and suggestions.

REFERENCES


