The current evaluation metrics (RMSE and MAE) can be quite misleading. Mixed-depth or blurry depth map outputs may actually lead to a lower RMSE, since averaging across depth discontinuities avoids the large per-pixel errors that RMSE penalizes quadratically, and such outputs are only weakly penalized by MAE (see Section 3 of Depth Coefficients for Depth Completion). Let's take a closer look at the point clouds created by re-projecting the depth maps into 3D space:
RMSE and MAE metrics are almost the same for the Typical Deep Network and Ours, but structure is clearly preserved in our method.
Orange: pedestrians and a cyclist (note the bicycle wheels). Green: thin poles. Red: even far-away cyclists are reasonably recovered.
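For reference, here is a minimal sketch of the kind of back-projection used to create these point clouds, assuming a standard pinhole camera model. The function name and the intrinsics (`fx`, `fy`, `cx`, `cy`) are placeholders for illustration, not values taken from this repository or the KITTI calibration files.

```python
import numpy as np

def depth_map_to_point_cloud(depth_map, fx, fy, cx, cy):
    """Back-project a dense depth map (in metres) into an (N, 3) point cloud.

    Assumes a pinhole camera with intrinsics fx, fy, cx, cy.
    Pixels with depth <= 0 are treated as invalid and skipped.
    """
    height, width = depth_map.shape
    u, v = np.meshgrid(np.arange(width), np.arange(height))

    valid = depth_map > 0
    z = depth_map[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy

    # Points are expressed in the camera frame (x right, y down, z forward)
    return np.stack([x, y, z], axis=-1)

# Example usage with placeholder intrinsics:
# points = depth_map_to_point_cloud(depth_map, fx=721.5, fy=721.5, cx=609.6, cy=172.9)
```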
Advantages of our method:
- Fast, runs on CPU only
- Structure preserving
- No training data required
- Only LiDAR data required
Disadvantages:
- Does not use RGB or LiDAR intensity information, which can provide useful semantic information for better reconstruction
- The shapes of some objects are not fully recovered when LiDAR points are lost
How to tell if a method is structure-preserving on the test server:
- Test Image 1: Check for snowmen (do the pedestrians look like pedestrians, or like large blobs?)
- Test Image 6: The closest pole should have a very low D1 error (do the depths show a pole, or has it been warped?)
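For the second check, a rough sketch of a D1-style outlier rate is below, assuming the KITTI stereo convention that a pixel is an outlier when its error exceeds both an absolute threshold of 3 and 5% of the ground truth; whether the test server computes D1 exactly this way, and the crop coordinates in the usage comment, are assumptions for illustration only.

```python
import numpy as np

def d1_error(est, gt, abs_thresh=3.0, rel_thresh=0.05):
    """Fraction of valid pixels that are D1 outliers.

    A pixel is counted as an outlier when its error exceeds both
    abs_thresh (in the map's units) and rel_thresh * ground truth
    (the usual KITTI stereo convention). Pixels with gt <= 0 are ignored.
    """
    valid = gt > 0
    err = np.abs(est[valid] - gt[valid])
    outliers = (err > abs_thresh) & (err > rel_thresh * gt[valid])
    return outliers.mean()

# Hypothetical usage: evaluate a crop around the closest pole in Test Image 6.
# pole_d1 = d1_error(est_map[200:300, 400:450], gt_map[200:300, 400:450])
```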