
Recall previously seen landmarks with optical flow

Mateo de Mayo requested to merge mateosss/recall-minimal into main

VIDEO: https://youtu.be/5mkVgoRCnqU

The previous recall MR !31 (closed) had a couple of race conditions and other issues. Figuring them out was not trivial, so a rewrite was required. This version uses optical flow instead of ORB matching, which significantly reduces the recall overhead (things are realtime again!) and increases recall success. The recall step in the front end can be toggled and configured through options in the JSON config file.

Some TODOs remain (TODO@mateosss) and I want to thoroughly measure the score changes with xrtslam-metrics. EDIT: Done, below.

@brunozanotti give it a look!

Video

This video shows an ad-hoc example of the MIO11 dataset with a limited number of landmarks for better visualization of the recall functionality. In the video, the first keyframe is fixed in place. Notice how the landmarks 0, 3, 4, 7, and 5, hosted by this keyframe, are recovered.

Results

Detailed metrics (from xrtslam-metrics)

These results are demonstrative, not complete: only a handful of datasets were evaluated, under multiple configurations of the recall process. Unfortunately, the ability to recall did not yield any significant improvement. One possible improvement is that, with recall enabled, you get a mix of patch and frame-to-frame optical flow that improves feature track length. If optical_flow_recall_over_tracking is enabled, it works as patch feature tracking that falls back to frame-to-frame tracking when a feature is lost. If it's disabled, it works as frame-to-frame tracking that falls back to patch optical flow to recover.
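The two modes described above can be sketched as a primary tracker with a fallback. This is only a minimal illustration, not the code in this MR: `track_landmark`, `track_patch`, and `track_f2f` are hypothetical names, the `recall_over_tracking` argument merely mirrors the `optical_flow_recall_over_tracking` option, and the behavior with recall disabled (frame-to-frame only) is an assumption.

```python
from typing import Callable, Optional, Tuple

Point = Tuple[float, float]
# A tracker maps a landmark id to its new 2D position, or None if tracking failed.
Tracker = Callable[[int], Optional[Point]]


def track_landmark(
    lm_id: int,
    track_patch: Tracker,  # hypothetical: optical flow against the original (recalled) patch
    track_f2f: Tracker,    # hypothetical: optical flow against the previous frame
    recall_enabled: bool,
    recall_over_tracking: bool,
) -> Optional[Point]:
    """Try the primary tracker first and use the other one only to recover."""
    if not recall_enabled:
        # Assumption: without recall, only frame-to-frame tracking is used.
        return track_f2f(lm_id)
    if recall_over_tracking:
        primary, fallback = track_patch, track_f2f
    else:
        primary, fallback = track_f2f, track_patch
    position = primary(lm_id)
    if position is None:
        # The primary tracker lost the feature, so try to recover it.
        position = fallback(lm_id)
    return position
```

In this sketch the flag only swaps which tracker runs first; the fallback runs only when the primary one loses the feature.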

  • Datasets used:
    • EMH*: EuRoC machine hall datasets.
    • TR*: TUM-VI room datasets.
    • MIO*: MSD (Monado SLAM datasets) Valve Index other datasets.
  • Configurations used:
    • base: Base (previous version)
    • basesr: Same as base but using the new safe_radius property to ignore black image corners
    • recf2f: Same as basesr but when a feature track is lost, uses recall to try to recover it with the original patch
    • rec: Same as basesr but it always tracks features by recalling (original patch) and when lost it tracks with the frame-to-frame patch
    • recnom: Same as rec but it doesn't forget about out-of-sight landmarks
    • recnomcam: Same as recnom but it also applies recall on cameras other than cam0
    • recnomcamfwd: Same as recnomcam but it uses a keyframe marginalization criterion based on spreading the forward look-at vectors of the window keyframes
    • basesr15: Same as base but uses 15 keyframes
    • recnomcamfwd15: Same as recnomcamfwd but uses 15 keyframes
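As an illustration of the safe_radius idea used by basesr, the sketch below discards keypoints that fall outside a circle, which is one way to ignore the black corners of a fisheye image. It rests on assumptions: that the radius is measured in pixels from the image center (the real property may use a different reference point or unit), and the function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def inside_safe_radius(x: float, y: float, width: int, height: int, safe_radius: float) -> bool:
    """Return True if (x, y) lies within safe_radius pixels of the image center."""
    # Assumption: the circle is centered on the image center.
    cx, cy = width / 2.0, height / 2.0
    return math.hypot(x - cx, y - cy) <= safe_radius


def filter_keypoints(keypoints: List[Point], width: int, height: int, safe_radius: float) -> List[Point]:
    """Keep only the keypoints that lie inside the safe circle."""
    return [(x, y) for (x, y) in keypoints if inside_safe_radius(x, y, width, height, safe_radius)]


# Example on a 640x480 image: the corner point (0, 0) is 400 px from the center and is dropped.
kept = filter_keypoints([(320, 240), (0, 0), (620, 240)], 640, 480, 300)
```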

Average (± stdev) pose estimation time [ms]

| Dataset | base | basesr | basesr15 | rec | recf2f | recnom | recnomcam | recnomcamfwd | recnomcamfwd15 |
|---|---|---|---|---|---|---|---|---|---|
| EMH02 | 12.86 ± 9.41 | 13.10 ± 6.67 | 74.94 ± 45.71 | 12.15 ± 4.16 | 13.12 ± 8.49 | 29.51 ± 27.64 | 31.94 ± 31.13 | 16.64 ± 10.20 | 90.65 ± 45.18 |
| EMH04 | 12.41 ± 9.20 | 11.77 ± 6.64 | 78.39 ± 32.66 | 10.75 ± 3.23 | 13.31 ± 10.76 | 41.21 ± 31.37 | 44.53 ± 33.18 | 25.56 ± 21.75 | 118.96 ± 45.90 |
| MIO05 | 10.88 ± 1.32 | 10.71 ± 1.37 | 12.66 ± 2.25 | 10.99 ± 1.43 | 10.72 ± 1.42 | 11.74 ± 1.77 | 12.76 ± 2.28 | 12.64 ± 2.27 | 22.48 ± 6.39 |
| MIO06 | 10.81 ± 1.37 | 10.63 ± 1.42 | 12.00 ± 2.19 | 10.88 ± 1.50 | 10.76 ± 1.50 | 12.29 ± 1.99 | 12.85 ± 2.34 | 13.01 ± 2.41 | 23.00 ± 6.99 |
| MIO07 | 10.84 ± 1.64 | 10.69 ± 1.67 | 12.72 ± 2.83 | 10.81 ± 1.66 | 10.53 ± 1.49 | 11.54 ± 1.57 | 11.90 ± 1.83 | 12.75 ± 2.41 | 19.72 ± 5.44 |
| MIO08 | 10.85 ± 1.20 | 10.82 ± 1.14 | 12.08 ± 1.82 | 10.82 ± 1.07 | 10.67 ± 1.00 | 11.87 ± 1.43 | 12.62 ± 1.87 | 13.28 ± 2.03 | 22.49 ± 6.52 |
| TR5 | 4.81 ± 0.61 | 4.73 ± 0.56 | 5.83 ± 0.98 | 4.84 ± 0.57 | 4.80 ± 0.65 | 6.09 ± 1.12 | 7.07 ± 1.69 | 7.40 ± 1.96 | 55.16 ± 53.26 |
| TR6 | 5.03 ± 0.66 | 5.03 ± 0.62 | 6.43 ± 1.00 | 5.15 ± 0.70 | 5.08 ± 0.72 | 5.83 ± 0.98 | 6.61 ± 1.20 | 7.50 ± 1.82 | 138.62 ± 59.67 |
| [AVG] | 9.81 ± 3.18 | 9.68 ± 2.51 | 26.88 ± 11.18 | 9.55 ± 1.79 | 9.87 ± 3.25 | 16.26 ± 8.48 | 17.54 ± 9.44 | 13.60 ± 5.61 | 61.39 ± 28.67 |
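The [AVG] row appears to be the plain mean of the per-dataset means and the plain mean of the per-dataset standard deviations. The sketch below reproduces it for the base column of the timing table. Whether xrtslam-metrics computes the row exactly this way is an assumption; the helper name `avg_row` is made up for illustration.

```python
from statistics import mean
from typing import Dict, Tuple

# (mean, stdev) pose estimation time in ms for the base configuration, copied from the table.
base_ms: Dict[str, Tuple[float, float]] = {
    "EMH02": (12.86, 9.41),
    "EMH04": (12.41, 9.20),
    "MIO05": (10.88, 1.32),
    "MIO06": (10.81, 1.37),
    "MIO07": (10.84, 1.64),
    "MIO08": (10.85, 1.20),
    "TR5": (4.81, 0.61),
    "TR6": (5.03, 0.66),
}


def avg_row(column: Dict[str, Tuple[float, float]]) -> Tuple[float, float]:
    """Average the means and the stdevs separately, rounded to two decimals."""
    means = [m for m, _ in column.values()]
    stdevs = [s for _, s in column.values()]
    return round(mean(means), 2), round(mean(stdevs), 2)


print(avg_row(base_ms))  # (9.81, 3.18), matching the [AVG] entry for base
```

Note that averaging standard deviations this way summarizes per-dataset variability; it is not the standard deviation of all pooled samples.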

Average feature count for each camera

| Dataset | base | basesr | basesr15 | rec | recf2f | recnom | recnomcam | recnomcamfwd | recnomcamfwd15 |
|---|---|---|---|---|---|---|---|---|---|
| EMH02 | [162 116] ± [82 70] | [162 116] ± [81 70] | [169 121] ± [82 71] | [162 113] ± [80 68] | [162 114] ± [80 69] | [164 112] ± [80 70] | [163 123] ± [80 67] | [145 114] ± [84 65] | [147 112] ± [85 65] |
| EMH04 | [129 114] ± [58 49] | [129 114] ± [58 49] | [135 121] ± [59 51] | [130 108] ± [58 48] | [131 109] ± [58 47] | [130 105] ± [58 49] | [131 117] ± [57 48] | [121 109] ± [58 49] | [125 111] ± [59 51] |
| MIO05 | [66 34] ± [33 22] | [65 34] ± [33 22] | [72 38] ± [34 23] | [71 29] ± [34 20] | [71 29] ± [34 20] | [79 25] ± [36 18] | [79 60] ± [36 28] | [66 50] ± [37 28] | [74 56] ± [38 30] |
| MIO06 | [44 28] ± [33 22] | [44 28] ± [33 22] | [46 29] ± [35 23] | [49 24] ± [34 20] | [49 24] ± [34 20] | [58 20] ± [37 17] | [58 45] ± [37 30] | [50 38] ± [36 28] | [56 42] ± [39 30] |
| MIO07 | [46 26] ± [30 19] | [46 27] ± [31 20] | [51 30] ± [32 21] | [49 22] ± [31 18] | [51 22] ± [32 18] | [56 19] ± [31 16] | [56 44] ± [33 27] | [53 42] ± [34 26] | [59 47] ± [37 28] |
| MIO08 | [29 22] ± [18 16] | [29 22] ± [18 16] | [31 24] ± [21 19] | [33 17] ± [18 12] | [33 17] ± [18 12] | [41 15] ± [22 11] | [40 33] ± [22 18] | [35 29] ± [21 18] | [40 32] ± [24 20] |
| TR5 | [111 99] ± [33 29] | [104 99] ± [31 29] | [106 101] ± [31 29] | [105 94] ± [31 28] | [105 94] ± [31 28] | [108 83] ± [30 28] | [107 102] ± [30 28] | [100 95] ± [32 31] | [100 95] ± [34 32] |
| TR6 | [144 125] ± [38 34] | [138 134] ± [37 37] | [142 138] ± [36 35] | [137 121] ± [37 35] | [138 122] ± [36 35] | [140 111] ± [34 34] | [140 135] ± [34 33] | [135 128] ± [40 38] | [134 130] ± [40 40] |
| [AVG] | [92 71] ± [41 33] | [90 72] ± [40 33] | [94 75] ± [41 34] | [92 66] ± [41 31] | [93 66] ± [41 31] | [97 61] ± [41 30] | [97 82] ± [41 35] | [88 76] ± [43 35] | [92 78] ± [45 37] |

Average completion percentage [%]

base basesr basesr15 rec recf2f recnom recnomcam recnomcamfwd recnomcamfwd15
EMH02
EMH04
MIO05
MIO06
MIO07
MIO08
TR5
TR6
[AVG]

Absolute trajectory error (ATE) [m]

| Dataset | base | basesr | basesr15 | rec | recf2f | recnom | recnomcam | recnomcamfwd | recnomcamfwd15 |
|---|---|---|---|---|---|---|---|---|---|
| EMH02 | 0.054 ± 0.027 | 0.057 ± 0.029 | 0.044 ± 0.019 | 0.048 ± 0.024 | 0.052 ± 0.021 | 0.073 ± 0.031 | 0.081 ± 0.033 | 0.059 ± 0.021 | 0.096 ± 0.049 |
| EMH04 | 0.128 ± 0.047 | 0.138 ± 0.045 | 0.145 ± 0.056 | 0.133 ± 0.053 | 0.118 ± 0.044 | 0.127 ± 0.049 | 0.157 ± 0.066 | 0.111 ± 0.052 | 0.163 ± 0.079 |
| MIO05 | 0.039 ± 0.016 | 0.035 ± 0.012 | 0.031 ± 0.010 | 0.040 ± 0.018 | 0.040 ± 0.016 | 0.027 ± 0.012 | 0.027 ± 0.012 | 0.079 ± 0.052 | 0.100 ± 0.057 |
| MIO06 | 0.050 ± 0.019 | 0.049 ± 0.020 | 0.052 ± 0.021 | 0.063 ± 0.020 | 0.058 ± 0.022 | 0.131 ± 0.047 | 0.064 ± 0.024 | 0.157 ± 0.078 | 0.347 ± 0.124 |
| MIO07 | 0.020 ± 0.009 | 0.023 ± 0.011 | 0.021 ± 0.009 | 0.021 ± 0.009 | 0.021 ± 0.008 | 0.019 ± 0.008 | 0.018 ± 0.007 | 0.041 ± 0.029 | 0.026 ± 0.013 |
| MIO08 | 0.058 ± 0.019 | 0.057 ± 0.019 | 0.056 ± 0.019 | 0.049 ± 0.014 | 0.052 ± 0.017 | 0.050 ± 0.016 | 0.032 ± 0.014 | 0.110 ± 0.040 | 0.149 ± 0.053 |
| TR5 | 0.167 ± 0.066 | 0.172 ± 0.067 | 0.161 ± 0.063 | 0.179 ± 0.071 | 0.179 ± 0.071 | 0.104 ± 0.034 | 0.067 ± 0.021 | 0.114 ± 0.080 | 8.083 ± 5.852 |
| TR6 | 0.018 ± 0.008 | 0.021 ± 0.010 | 0.020 ± 0.009 | 0.023 ± 0.011 | 0.024 ± 0.011 | 0.021 ± 0.008 | 0.021 ± 0.009 | 0.031 ± 0.019 | 0.250 ± 0.089 |
| [AVG] | 0.067 ± 0.026 | 0.069 ± 0.027 | 0.066 ± 0.026 | 0.070 ± 0.027 | 0.068 ± 0.026 | 0.069 ± 0.026 | 0.058 ± 0.023 | 0.088 ± 0.046 | 1.152 ± 0.790 |

Relative trajectory error (RTE) [m]

| Dataset | base | basesr | basesr15 | rec | recf2f | recnom | recnomcam | recnomcamfwd | recnomcamfwd15 |
|---|---|---|---|---|---|---|---|---|---|
| EMH02 | 0.004329 ± 0.002343 | 0.004419 ± 0.002405 | 0.004100 ± 0.002167 | 0.004153 ± 0.002187 | 0.004258 ± 0.002290 | 0.005670 ± 0.003609 | 0.006890 ± 0.005006 | 0.009928 ± 0.008238 | 0.026622 ± 0.024411 |
| EMH04 | 0.012654 ± 0.008158 | 0.012670 ± 0.008132 | 0.012459 ± 0.007780 | 0.012541 ± 0.007866 | 0.012709 ± 0.007867 | 0.014366 ± 0.009099 | 0.016300 ± 0.011020 | 0.015243 ± 0.009976 | 0.024412 ± 0.019938 |
| MIO05 | 0.003207 ± 0.002213 | 0.003283 ± 0.002283 | 0.002996 ± 0.002049 | 0.002949 ± 0.001886 | 0.002970 ± 0.001895 | 0.003512 ± 0.002283 | 0.003391 ± 0.002245 | 0.019763 ± 0.015659 | 0.018908 ± 0.014354 |
| MIO06 | 0.010514 ± 0.008618 | 0.010489 ± 0.008601 | 0.010517 ± 0.008681 | 0.010723 ± 0.008798 | 0.010807 ± 0.008852 | 0.014595 ± 0.012118 | 0.014848 ± 0.012368 | 0.050932 ± 0.036953 | 0.070102 ± 0.055537 |
| MIO07 | 0.002425 ± 0.001338 | 0.002452 ± 0.001348 | 0.002271 ± 0.001292 | 0.002568 ± 0.001428 | 0.002497 ± 0.001326 | 0.003637 ± 0.002198 | 0.003741 ± 0.002292 | 0.013141 ± 0.011023 | 0.008980 ± 0.006774 |
| MIO08 | 0.007246 ± 0.004699 | 0.007178 ± 0.004626 | 0.007065 ± 0.004586 | 0.007684 ± 0.004952 | 0.007638 ± 0.004926 | 0.009658 ± 0.006122 | 0.011545 ± 0.007012 | 0.036824 ± 0.025922 | 0.046811 ± 0.032161 |
| TR5 | 0.009210 ± 0.005524 | 0.009253 ± 0.005522 | 0.009195 ± 0.005494 | 0.009302 ± 0.005546 | 0.009244 ± 0.005492 | 0.010102 ± 0.005745 | 0.010578 ± 0.006211 | 0.062505 ± 0.052360 | 0.214259 ± 0.179424 |
| TR6 | 0.003923 ± 0.002226 | 0.004068 ± 0.002348 | 0.003956 ± 0.002284 | 0.004037 ± 0.002286 | 0.004176 ± 0.002413 | 0.004444 ± 0.002613 | 0.004390 ± 0.002422 | 0.008954 ± 0.006214 | 0.065965 ± 0.051099 |
| [AVG] | 0.006688 ± 0.004390 | 0.006726 ± 0.004408 | 0.006570 ± 0.004292 | 0.006745 ± 0.004368 | 0.006787 ± 0.004383 | 0.008248 ± 0.005473 | 0.008960 ± 0.006072 | 0.027161 ± 0.020793 | 0.059507 ± 0.047962 |

Segment drift per meter error (SDM 0.01m) [m/m]

| Dataset | base | basesr | basesr15 | rec | recf2f | recnom | recnomcam | recnomcamfwd | recnomcamfwd15 |
|---|---|---|---|---|---|---|---|---|---|
| EMH02 | 0.0238 ± 0.0192 | 0.0247 ± 0.0163 | 0.0218 ± 0.0236 | 0.0254 ± 0.0395 | 0.0227 ± 0.0154 | 0.0856 ± 0.3192 | 0.0837 ± 0.1854 | 0.2049 ± 0.3661 | 0.4323 ± 0.5261 |
| EMH04 | 0.0717 ± 0.1356 | 0.0709 ± 0.1488 | 0.0695 ± 0.1190 | 0.0679 ± 0.1228 | 0.0681 ± 0.1137 | 0.0919 ± 0.2678 | 0.1060 ± 0.1921 | 0.0875 ± 0.1650 | 0.1448 ± 0.2259 |
| MIO05 | 0.1181 ± 0.1751 | 0.1217 ± 0.1850 | 0.0997 ± 0.1529 | 0.0795 ± 0.0901 | 0.0788 ± 0.0923 | 0.1291 ± 0.2160 | 0.0930 ± 0.1000 | 0.6286 ± 0.3950 | 0.6498 ± 0.4060 |
| MIO06 | 0.2585 ± 0.3661 | 0.2660 ± 0.3805 | 0.2593 ± 0.3675 | 0.2612 ± 0.3558 | 0.2648 ± 0.3533 | 0.2972 ± 0.3478 | 0.3083 ± 0.3631 | 0.6819 ± 0.4035 | 0.7296 ± 0.4243 |
| MIO07 | 0.1196 ± 0.3946 | 0.1089 ± 0.2037 | 0.1141 ± 0.2252 | 0.1092 ± 0.1738 | 0.1017 ± 0.1878 | 0.2008 ± 0.2690 | 0.1511 ± 0.2068 | 0.4738 ± 0.3695 | 0.4393 ± 0.3404 |
| MIO08 | 0.1375 ± 0.1289 | 0.1397 ± 0.1264 | 0.1337 ± 0.1056 | 0.1469 ± 0.1457 | 0.1551 ± 0.1942 | 0.2162 ± 0.2067 | 0.2828 ± 0.3165 | 0.7686 ± 0.4503 | 0.7970 ± 0.4701 |
| TR5 | 0.0591 ± 0.0704 | 0.0570 ± 0.0690 | 0.0597 ± 0.0833 | 0.0586 ± 0.0865 | 0.0582 ± 0.0813 | 0.0613 ± 0.0636 | 0.0662 ± 0.0660 | 0.4011 ± 0.4671 | 0.6059 ± 0.6546 |
| TR6 | 0.0212 ± 0.0261 | 0.0180 ± 0.0170 | 0.0198 ± 0.0260 | 0.0222 ± 0.0271 | 0.0264 ± 0.0358 | 0.0531 ± 0.2186 | 0.0267 ± 0.0331 | 0.1459 ± 0.2633 | 0.6125 ± 0.5577 |
| [AVG] | 0.1012 ± 0.1645 | 0.1009 ± 0.1433 | 0.0972 ± 0.1379 | 0.0964 ± 0.1301 | 0.0970 ± 0.1342 | 0.1419 ± 0.2386 | 0.1397 ± 0.1829 | 0.4240 ± 0.3600 | 0.5514 ± 0.4506 |

EDIT: The ATE results from recnomcamfwd15 were pretty bad when I was hoping they would be the best. After analyzing, for example, the TUM-VI room datasets (TR), I tweaked it a bit to use fewer features and it beats all the other versions, giving an average ATE of 0.036 (scores TR1-6: 0.031, 0.037, 0.062, 0.035, 0.030, 0.021), compared to the previous 0.094 average. This is something to study a bit more. Meanwhile, here is a video of a run on the TR5 dataset: https://youtu.be/5mkVgoRCnqU

Edited by Mateo de Mayo
