LiDAR-360 RGB Camera-360 Thermal Camera Targetless Calibration for Dynamic Situations


Author Contributions

Conceptualization, K.B.T. and A.C.; Methodology, K.B.T.; Software, K.B.T.; Validation, K.B.T. and A.C.; Data curation, K.B.T. and A.C.; Writing—original draft, K.B.T.; Writing—review and editing, K.B.T. and A.C.; Visualization, K.B.T., A.C. and K.T.; Supervision, A.C. and K.T. All authors have read and agreed to the published version of the manuscript.

Figure 1.
Visualization of the system including RGB cameras, thermal cameras, and LiDAR. The 360 RGB camera and the 360 thermal camera are built from independent cameras to remove blind spots. Images and point clouds are compensated to reduce the negative impact of motion. The point clouds and images are then used for sensor calibration based on extracted features.

Figure 2.
Visualization of the target detected by two types of cameras. The left image shows the target detected by the RGB camera, and the right image shows the target detected by the thermal camera.

Figure 3.
Our system includes the following sensors: a Velodyne Alpha Prime LiDAR, a LadyBug-5 camera, six FLIR ADK cameras, an Ouster-128 LiDAR, an Ouster-64 LiDAR, and a Hesai Pandar LiDAR.

Figure 4.
Visualization of stitching 360 thermal images and 360 RGB images.

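As an illustration of how such a panorama can be assembled from the individual camera frames, the sketch below uses OpenCV's high-level stitcher. This is a generic stand-in rather than the paper's stitching pipeline, and the image file names are hypothetical.

```python
import cv2

# Hypothetical file names; in practice the six camera frames would be loaded
# from the actual recording.
paths = ["cam_0.png", "cam_1.png", "cam_2.png", "cam_3.png", "cam_4.png", "cam_5.png"]
images = [cv2.imread(p) for p in paths]

# OpenCV's stitcher estimates pairwise homographies, warps the frames onto a
# common surface, and blends the seams into one panoramic image.
stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
status, panorama = stitcher.stitch(images)
if status == cv2.Stitcher_OK:
    cv2.imwrite("panorama_360.png", panorama)
```
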
Figure 5.
Pipeline of the registration process. The approach is divided into two parts: one part focuses on detecting key points from RGB images and thermal images, while the other detects key points from images converted from LiDAR point clouds. For images generated from LiDAR point clouds, a velocity estimation step is required to perform distortion correction, ensuring the accurate positioning of the scanned points. After distortion correction, the extrinsic parameters of the LiDAR, the 360 RGB camera, and the 360 thermal camera can be calibrated.

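To make the final calibration step concrete, the sketch below recovers camera extrinsics from matched 3D–2D correspondences with a RANSAC-based PnP solver. This is a generic stand-in for the paper's own optimization; the intrinsics and the synthetic correspondences are placeholders.

```python
import numpy as np
import cv2

# Synthetic example: project known 3D points with a known pose so the sketch is
# self-contained. In the real pipeline, pts_3d and pts_2d would be the matched
# LiDAR key points (after distortion correction) and image key points.
K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 360.0],
              [0.0, 0.0, 1.0]])
dist = np.zeros(5)
rvec_true = np.array([0.02, -0.01, 0.03])
tvec_true = np.array([0.10, -0.05, 0.30])
pts_3d = np.random.uniform(1.0, 10.0, size=(30, 3))
pts_2d, _ = cv2.projectPoints(pts_3d, rvec_true, tvec_true, K, dist)

# RANSAC-based PnP rejects outlier pairs and returns the LiDAR-to-camera
# rotation (as a Rodrigues vector) and translation.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(pts_3d, pts_2d, K, dist)
R, _ = cv2.Rodrigues(rvec)
print("rotation:\n", R, "\ntranslation:", tvec.ravel())
```
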
Figure 6.
Visualization of features extracted from RGB images.

Figure 7.
Pipeline of our approach. The first step enhances the images with Retinex Decomposition. The second step extracts key features from n + 1 consecutive images. The third step uses MobileNetV3 to remove noisy features on moving objects.

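The sketch below illustrates the three steps with common substitutes: a single-scale Retinex enhancement, ORB feature extraction, and removal of features falling inside detected moving-object boxes. It is not the paper's implementation, which uses its own Retinex Decomposition, feature extractor, and a MobileNetV3-based detector; the file name and boxes here are hypothetical.

```python
import cv2
import numpy as np

def single_scale_retinex(gray, sigma=80):
    # Retinex-style enhancement: reflectance = log(image) - log(illumination),
    # with the illumination approximated by a large Gaussian blur.
    img = gray.astype(np.float32) + 1.0
    illum = cv2.GaussianBlur(img, (0, 0), sigma)
    r = np.log(img) - np.log(illum)
    return cv2.normalize(r, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

# Hypothetical input frame; the paper processes n + 1 consecutive frames.
frame = cv2.imread("rgb_frame.png", cv2.IMREAD_GRAYSCALE)
enhanced = single_scale_retinex(frame)

# ORB stands in here for the paper's feature extractor.
orb = cv2.ORB_create(nfeatures=2000)
keypoints, descriptors = orb.detectAndCompute(enhanced, None)

# Placeholder moving-object boxes; the paper obtains them with a
# MobileNetV3-based detector. Features inside these boxes are discarded.
moving_boxes = [(100, 150, 220, 300)]  # (x1, y1, x2, y2)

def outside_boxes(kp):
    x, y = kp.pt
    return all(not (x1 <= x <= x2 and y1 <= y <= y2)
               for x1, y1, x2, y2 in moving_boxes)

static_keypoints = [kp for kp in keypoints if outside_boxes(kp)]
```
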
Figure 8.
The top image shows the results before enhancement by Retinex Decomposition. The bottom image shows the results after enhancement by Retinex Decomposition.

Figure 9.
The top image, with red rectangles, shows reliable features extracted from n + 1 consecutive RGB images. The bottom image, with green rectangles, shows the reliable features remaining after filtering with MobileNetV3.

Figure 10.
Visualization of features extracted from thermal images.

Figure 11.
Visualization of image projection. (a) shows the 3D point cloud data from the LiDAR. (b) presents the 2D image data with the intensity channel. (c) presents the 2D image data with the range channel. The height of the image is 128, corresponding to the number of channels in the LiDAR.

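A minimal sketch of such a projection is given below, assuming a vertical field of view of ±22.5° (this value is an assumption and should be replaced by the actual LiDAR specification): each point's azimuth selects the image column, its elevation selects one of the 128 rows, and the range and intensity channels are filled with the closest return per pixel.

```python
import numpy as np

def spherical_projection(points, intensities, H=128, W=1024,
                         fov_up_deg=22.5, fov_down_deg=-22.5):
    """Project an (N, 3) point cloud to HxW range and intensity images."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1) + 1e-9

    yaw = np.arctan2(y, x)        # azimuth in [-pi, pi]
    pitch = np.arcsin(z / r)      # elevation

    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)
    u = 0.5 * (1.0 - yaw / np.pi) * W                # column from azimuth
    v = (fov_up - pitch) / (fov_up - fov_down) * H   # row from elevation

    u = np.clip(np.floor(u), 0, W - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int32)

    range_img = np.zeros((H, W), dtype=np.float32)
    intensity_img = np.zeros((H, W), dtype=np.float32)
    # Sort far-to-near so the closest return wins when pixels collide.
    order = np.argsort(-r)
    range_img[v[order], u[order]] = r[order]
    intensity_img[v[order], u[order]] = intensities[order]
    return range_img, intensity_img
```
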
Figure 12.
Visualization of key points extracted from LiDAR images. (a) simulates key points across two frames, while (b) simulates the selection of key points that are similar across the two frames.

Figure 13.
Pipeline of our approach. Key features of the projected images are extracted by SuperPoint enhanced with an LSTM. These features are matched to find point pairs in two consecutive frames.

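As a simplified stand-in for the learned matcher, the sketch below pairs descriptors of two consecutive frames by mutual nearest neighbours; the descriptor arrays are placeholders for the SuperPoint/LSTM outputs.

```python
import numpy as np

def mutual_nearest_neighbors(desc_a, desc_b):
    """Match descriptor sets (N_a, D) and (N_b, D) by keeping only pairs that
    are each other's nearest neighbour."""
    # Pairwise Euclidean distances between all descriptors.
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    nn_ab = d.argmin(axis=1)   # best match in b for every descriptor of a
    nn_ba = d.argmin(axis=0)   # best match in a for every descriptor of b
    return [(i, j) for i, j in enumerate(nn_ab) if nn_ba[j] == i]

# Hypothetical descriptors from two consecutive projected LiDAR frames.
desc_prev = np.random.rand(200, 256).astype(np.float32)
desc_curr = np.random.rand(220, 256).astype(np.float32)
pairs = mutual_nearest_neighbors(desc_prev, desc_curr)
```
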
Figure 14.
Pipeline of the ego-motion compensation process. First, the point clouds are converted into two-dimensional images using Spherical Projection. Key features are then identified within these range images, and corresponding point pairs are matched. By matching key feature pairs, the distance between frames can be determined, allowing for velocity estimation. Finally, the velocity and timestamps are used for ego-motion compensation and point cloud accumulation.

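The velocity-estimation step can be sketched as follows, assuming the matched key points have been mapped back to 3D and the LiDAR runs at 10 Hz: the median displacement of the matched points divided by the frame interval gives a robust ego-velocity estimate. This is a simplified translational model, not the paper's full estimator.

```python
import numpy as np

def estimate_velocity(pts_prev, pts_curr, dt=0.1):
    """Estimate ego velocity from matched 3D points of two consecutive frames.
    pts_prev/pts_curr: (N, 3) arrays of the same matched points; dt is the
    frame interval (0.1 s for a 10 Hz LiDAR). The median keeps the estimate
    robust to remaining matches on moving objects."""
    # Static world points appear to move opposite to the ego motion.
    displacement = pts_prev - pts_curr
    return np.median(displacement, axis=0) / dt   # (vx, vy, vz) in m/s

# Hypothetical matched points; in the pipeline they come from the projected
# image key point pairs mapped back to 3D.
p_prev = np.random.rand(100, 3)
p_curr = p_prev - np.array([0.55, 0.02, 0.0])   # simulated forward motion
print(estimate_velocity(p_prev, p_curr))        # ~ [5.5, 0.2, 0.0] m/s
```
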
Figure 15.
Visualization of distortion correction. The motion of the vehicle is represented by the circles, and the LiDAR is also rotating while the vehicle is in motion. (a) shows the actual shape of the obstacle. (b) depicts the shape of the obstacle scanned by LiDAR. (c) illustrates the shape of the obstacle after distortion correction.

Figure 16.
Visualization of the differences in distortion correction on 3D point clouds within a frame with a speed of 54 km/h and a frequency of 10 Hz. The red part shows the original points of the point clouds, while the green part shows the corrected points. The left image shows points on the xy-plane. The right image shows points on the yz-plane.

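The numbers in this figure make the need for correction concrete: 54 km/h is 15 m/s, so at 10 Hz the vehicle moves up to 1.5 m during a single sweep. Below is a translation-only de-skew sketch under a constant-velocity assumption; the full method also accounts for rotation during the sweep.

```python
import numpy as np

def deskew_translation_only(points, timestamps, velocity, t_ref=None):
    """Shift every point to where it would have been measured if the whole
    sweep had been captured at the reference time (here: start of the sweep).
    points: (N, 3); timestamps: (N,) per-point times in seconds;
    velocity: (3,) ego velocity in m/s. Rotation is ignored in this sketch."""
    if t_ref is None:
        t_ref = timestamps.min()
    dt = (timestamps - t_ref)[:, None]      # time elapsed since the reference pose
    return points + velocity[None, :] * dt  # compensate the ego displacement

# Worked example with the figure's numbers: 54 km/h = 15 m/s, 10 Hz sweep.
v = np.array([15.0, 0.0, 0.0])
pts = np.random.rand(1000, 3) * 50.0
ts = np.linspace(0.0, 0.1, pts.shape[0])    # per-point timestamps over one sweep
corrected = deskew_translation_only(pts, ts, v)
print(np.abs(corrected - pts)[:, 0].max())  # ~ 1.5 m for the last-scanned points
```
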
Figure 17.
Visualization of distortion correction of cameras. The blue rectangle is the actual shape, and the red rectangle is the shape distorted by ego-motion.

Figure 18.
Visualization of 360 RGB–LiDAR images calibration. The top image, with red rectangles, indicates the calibration results before applying correction. The bottom image, with green rectangles, indicates the calibration results after applying correction.

Figure 19.
Visualization of 360 RGB–thermal images calibration. The top image includes red rectangles that indicate the calibration results before applying correction. The bottom image includes green rectangles that indicate the calibration results after applying correction.

Figure 20.
Visualization of point clouds extracted from Ouster OS1-128 and Velodyne Alpha Prime.

Figure 21.
Visualization of image projection. (a) presents the 2D image data with the intensity channel from Ouster OS1-128. (b) presents the 2D image data with the range channel from Ouster OS1-128. (c) presents the 2D image data with the intensity channel from Velodyne Alpha Prime. (d) presents the 2D image data with the range channel from Velodyne Alpha Prime.

Figure 23.
Comparison with CNN, RIFT, RI-MFM by MAE.

Figure 24.
Comparison with CNN, RIFT, RI-MFM by Accuracy.

Figure 25.
Comparison with CNN, RIFT, RI-MFM by RMSE.

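For reference, MAE and RMSE are computed from per-sample errors as in the short sketch below; the exact error quantity used in these comparisons follows the paper, and the values here are placeholders.

```python
import numpy as np

def mae(pred, gt):
    # Mean absolute error between predicted and reference values.
    return np.mean(np.abs(pred - gt))

def rmse(pred, gt):
    # Root mean squared error; penalizes large deviations more than MAE.
    return np.sqrt(np.mean((pred - gt) ** 2))

# Hypothetical per-frame estimates vs. reference values.
pred = np.array([1.02, 0.98, 1.10, 0.95])
gt = np.array([1.00, 1.00, 1.00, 1.00])
print(mae(pred, gt), rmse(pred, gt))
```
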
Figure 26.
Red points represent the results of calibration without distortion correction, while blue points represent the results with distortion correction in static situations. The dashed line shows the results from the target-based method.

Figure 27.
Red points and blue points represent the results of calibration without and with distortion correction, respectively, in dynamic situations. The dashed lines represent the results obtained using the actual data.

Figure 28.
Comparison of the rotation and translation errors of the three methods.


Table 1.
Maximum and average velocity measurements.

Name            Max Velocity    Mean Velocity
Ouster          9.51 m/s        5.56 m/s
Velodyne        9.74 m/s        5.59 m/s
Ground Truth    9.47 m/s        5.59 m/s

Table 2.
Mean errors of targetless calibration in static situations.

Mean Errors                  Without Ego-Motion Compensation    With Ego-Motion Compensation
Roll Error (°)               1.0714                             1.0886
Pitch Error (°)              0.2162                             0.3624
Yaw Error (°)                0.7999                             1.0015
Translation (X) Error (m)    0.0032                             0.0054
Translation (Y) Error (m)    0.0051                             0.0081
Translation (Z) Error (m)    0.0048                             0.0064

Table 3.
Mean errors of targetless calibration in dynamic situations.

Velocity         Mean Errors                  Without Ego-Motion Compensation    With Ego-Motion Compensation
2 m/s–3 m/s      Roll Error (°)               1.2446                             1.1996
                 Pitch Error (°)              0.4448                             0.3734
                 Yaw Error (°)                1.1874                             1.1131
                 Translation (X) Error (m)    0.0184                             0.0121
                 Translation (Y) Error (m)    0.0136                             0.0086
                 Translation (Z) Error (m)    0.0087                             0.0080
3 m/s–4 m/s      Roll Error (°)               1.3190                             1.2146
                 Pitch Error (°)              0.5779                             0.3440
                 Yaw Error (°)                1.2619                             1.1733
                 Translation (X) Error (m)    0.0268                             0.0152
                 Translation (Y) Error (m)    0.0185                             0.0093
                 Translation (Z) Error (m)    0.0099                             0.0082
4 m/s–5 m/s      Roll Error (°)               1.3797                             1.2636
                 Pitch Error (°)              0.6200                             0.3866
                 Yaw Error (°)                1.3021                             1.1652
                 Translation (X) Error (m)    0.0239                             0.0137
                 Translation (Y) Error (m)    0.0210                             0.0102
                 Translation (Z) Error (m)    0.0101                             0.0088
5 m/s–6 m/s      Roll Error (°)               1.4497                             1.3053
                 Pitch Error (°)              0.7387                             0.4228
                 Yaw Error (°)                1.3870                             1.2093
                 Translation (X) Error (m)    0.0213                             0.0148
                 Translation (Y) Error (m)    0.0206                             0.0105
                 Translation (Z) Error (m)    0.0110                             0.0087
6 m/s–7 m/s      Roll Error (°)               1.5521                             1.2817
                 Pitch Error (°)              0.6482                             0.4519
                 Yaw Error (°)                1.4743                             1.2412
                 Translation (X) Error (m)    0.0242                             0.0157
                 Translation (Y) Error (m)    0.0199                             0.0108
                 Translation (Z) Error (m)    0.0116                             0.0094
7 m/s–8 m/s      Roll Error (°)               1.6495                             1.3175
                 Pitch Error (°)              0.7719                             0.4841
                 Yaw Error (°)                1.6465                             1.2775
                 Translation (X) Error (m)    0.0236                             0.0162
                 Translation (Y) Error (m)    0.0189                             0.0106
                 Translation (Z) Error (m)    0.0132                             0.0091
8 m/s–9 m/s      Roll Error (°)               1.7663                             1.3422
                 Pitch Error (°)              0.8377                             0.5072
                 Yaw Error (°)                1.7007                             1.2710
                 Translation (X) Error (m)    0.0249                             0.0154
                 Translation (Y) Error (m)    0.0226                             0.0114
                 Translation (Z) Error (m)    0.0147                             0.0110
9 m/s–9.5 m/s    Roll Error (°)               1.9336                             1.3637
                 Pitch Error (°)              0.9996                             0.5446
                 Yaw Error (°)                1.8124                             1.3053
                 Translation (X) Error (m)    0.0256                             0.0184
                 Translation (Y) Error (m)    0.0241                             0.0138
                 Translation (Z) Error (m)    0.0153                             0.0113


