2.4 KB · updated 2026-07-06 · md

examples/demo-corpus/data/research/daily/2026-04-30/digest.md

Daily digest — 2026-04-30

SLAM day. The thread I wanted to chase: how neural-field representations absorbed the SLAM pipeline, and what is still classical underneath.

DROID-SLAM (Teed & Deng, 2021) is the end-to-end deep-learning SLAM baseline. The trick is a recurrent update operator that jointly refines depth and pose via a differentiable bundle adjustment layer. The princeton-vl/DROID-SLAM release is still the cleanest "deep SLAM that actually runs in real time" reference codebase. Notably, the underlying BA is conceptually the same as in DSO (Gauss-Newton over camera poses and inverse depths); what changed is that the feature extractor and the update rule are now learned.
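
To make the "classical underneath" point concrete, here is a toy Gauss-Newton step with the structure a bundle adjustment layer exploits: the depth block of the normal equations is diagonal, so depths are eliminated with a Schur complement and only a small pose system is solved. This is a sketch under heavy assumptions, not DROID-SLAM's code: the camera has a single scalar translation, frame A is pinned at translation 1.0 to fix scale, and the targets `p` and `q` are synthetic stand-ins for the correspondences a learned update operator would propose. All names are hypothetical.

```python
def gauss_newton_step(t, d, p, q):
    """One Gauss-Newton step on a toy pose-and-depth problem.

    Unknowns: scalar translation t of frame B, inverse depths d[i].
    Model:    frame A (translation pinned at 1.0) predicts disparity d[i],
              frame B predicts disparity t * d[i].
    Targets:  p[i] for frame A, q[i] for frame B (synthetic in this sketch).
    """
    # Residuals: target minus prediction.
    r_a = [pi - di for pi, di in zip(p, d)]
    r_b = [qi - t * di for qi, di in zip(q, d)]

    # Blocks of the normal equations H dx = g.
    h_tt = sum(di * di for di in d)   # pose block (1x1 here)
    h_td = [t * di for di in d]       # pose-depth coupling, one entry per point
    h_dd = 1.0 + t * t                # depth block is diagonal (same value per point)
    g_t = sum(di * rb for di, rb in zip(d, r_b))
    g_d = [ra + t * rb for ra, rb in zip(r_a, r_b)]

    # Schur complement: eliminate the depths and solve the pose system first.
    s = h_tt - sum(h * h for h in h_td) / h_dd
    rhs = g_t - sum(h * g for h, g in zip(h_td, g_d)) / h_dd
    dt = rhs / s

    # Back-substitute: each depth update is independent of the others.
    dd = [(g - h * dt) / h_dd for g, h in zip(g_d, h_td)]
    return t + dt, [di + ddi for di, ddi in zip(d, dd)]


# Synthetic ground truth, used only to generate noise-free targets.
true_t, true_d = 2.0, [0.5, 0.25, 1.0]
p = list(true_d)
q = [true_t * di for di in true_d]

t, d = 1.0, [1.0, 1.0, 1.0]           # deliberately wrong initial guess
for _ in range(20):
    t, d = gauss_newton_step(t, d, p, q)
```

In a real system the pose block is 6-DoF per keyframe and the residuals carry per-pixel weights, but the elimination order (depths first, poses second) is the same idea.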

In today's reading, the neural-implicit SLAM line starts with NICE-SLAM (Zhu et al., 2021): a hierarchical voxel grid of features, optimised online for both mapping and tracking. Point-SLAM (Sandström et al., 2023) trades the grid for a neural point cloud, with adaptive resolution where the scene needs it. Co-SLAM (Wang et al., 2023) re-introduces a sparse parametric encoding (a multi-resolution hash grid) for speed, the same trade-off Instant-NGP made for offline NeRF.
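
A rough sketch of what "hierarchical grid of features" means at query time: interpolate a feature vector from each resolution level and concatenate them. It assumes a 2-D grid with bilinear interpolation to stay short; the systems above work in 3-D and feed the result to a decoder network, which is omitted here. The function names and the tiny grids are made up for illustration.

```python
def bilinear(grid, x, y):
    """Interpolate a 2-D grid of feature vectors at (x, y) in [0, 1]^2.

    grid[row][col] is a list of floats; the grid must be at least 2x2.
    """
    rows, cols = len(grid), len(grid[0])
    gx, gy = x * (cols - 1), y * (rows - 1)
    # Clamp the cell index so x == 1.0 or y == 1.0 stays inside the last cell.
    x0, y0 = min(int(gx), cols - 2), min(int(gy), rows - 2)
    fx, fy = gx - x0, gy - y0
    out = []
    for k in range(len(grid[0][0])):
        top = (1 - fx) * grid[y0][x0][k] + fx * grid[y0][x0 + 1][k]
        bot = (1 - fx) * grid[y0 + 1][x0][k] + fx * grid[y0 + 1][x0 + 1][k]
        out.append((1 - fy) * top + fy * bot)
    return out


def hierarchical_features(levels, x, y):
    """Concatenate interpolated features from coarse to fine levels."""
    feats = []
    for grid in levels:
        feats.extend(bilinear(grid, x, y))
    return feats


# Two hypothetical levels with one feature channel each.
coarse = [[[0.0], [1.0]],
          [[2.0], [3.0]]]
fine = [[[0.0], [1.0], [2.0]],
        [[3.0], [4.0], [5.0]],
        [[6.0], [7.0], [8.0]]]

centre = hierarchical_features([coarse, fine], 0.5, 0.5)
```

Because the lookup is linear in the stored features, gradients from a rendering loss reach only the grid cells around each query, which is what makes online optimisation of the map tractable. A hashed encoding keeps the same interpolation but replaces the dense `grid[row][col]` index with a hash-table lookup.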

The new arrivals are the Gaussian-Splatting SLAM systems. GS-SLAM (Yan et al., 2023) and Gaussian Splatting SLAM / MonoGS (Matsuki et al., 2023) both swap the implicit volume for explicit 3D Gaussians, retain the differentiable rasteriser from the base 3DGS paper, and add online keyframe management. MonoGS in particular reports the highest-quality photometric reconstructions of any monocular SLAM system in the corpus.
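
For reference, the per-pixel core of a splatting rasteriser is front-to-back alpha compositing over depth-sorted Gaussians. The sketch below assumes each splat's alpha at the pixel is already known (in 3DGS it comes from the Gaussian's opacity and its projected 2-D falloff) and uses a scalar colour; the `composite` helper is hypothetical and not taken from any of the codebases above.

```python
def composite(splats):
    """Front-to-back alpha compositing for one pixel.

    splats: list of (depth, colour, alpha) tuples covering the pixel.
    Returns the blended colour and the remaining transmittance.
    """
    colour, transmittance = 0.0, 1.0
    # Nearest splat first, so closer Gaussians occlude farther ones.
    for _, c, a in sorted(splats, key=lambda s: s[0]):
        colour += transmittance * a * c
        transmittance *= 1.0 - a
    return colour, transmittance


# Two half-transparent splats: a bright one in front of a dark one.
pixel, leftover = composite([(2.0, 0.0, 0.5), (1.0, 1.0, 0.5)])
```

Every operation here is differentiable with respect to colour and alpha, which is why a photometric loss can drive both the Gaussian parameters and, in the SLAM setting, the camera pose.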

The open question — what happens when the camera moves fast — is what I want to write up next week. None of these systems benchmarks aggressively on motion-blurred or fast-pan captures.

Six papers, ~55 min. Heavy day. Diffusion-based generation tomorrow.