Claude Code · 2026-05-14 · unknown branch
Session Summary: Concept trace — Multi-View Consistency across the corpus
Main outcome: Traced 'Multi-View Consistency' through the corpus as a five-stage timeline anchored to NeRF (2020), NeuS (2021), DROID-SLAM (2021), 3DGS / MonoGS (2023), and 2DGS / MVSplat (2024). The researcher redirected the scope to include the pre-NeRF MVS era, so the agent broadened to COLMAP MVS (2016), DeepV2D (2018), and UNISURF (2021), then compared definitions across papers and concluded there is no single corpus-wide definition: three distinct operationalisations (photometric reprojection, geometric depth consistency, shared-surface topology) are bundled under one name, and they are not equivalent.
Timeline & size
Key decisions
None recorded.
Files touched
None recorded.
Commands run
None recorded.
Tools used
- node_context
- search_facts
- search_nodes
- timeline
- wiki_page
Turn-by-turn conversation
Redacted user/assistant transcript turns, with assistant tool use collapsed under its response.
Trace 'Multi-View Consistency' as a concept across the corpus.
thinking: Concept node first, then pull edges, then facts.
I'll look up the concept node, expand its neighbours, and then pull temporal facts that mention it.
Tool use (1)
json{
"limit": 5,
"query": "multi-view consistency",
"type": "Concept"
}
thinking: No exact Concept match. The corpus may model it as an OpenQuestion instead. Re-search across types.
No Concept node for that exact name. Let me broaden the search.
Tool use (1)
json{
"limit": 10,
"query": "multi-view consistency"
}
thinking: The corpus tracks the concept through (a) the OpenQuestion node and (b) explicit fact assertions. Pull the OpenQuestion neighbours and search facts.
The concept lives as an OpenQuestion plus a related Concept (Multi-View Stereo). I'll trace via the question's neighbours and a fact search.
Tool use (4)
json{
"limit": 30,
"node_id": "OpenQuestion:multi-view-consistency"
}
json{
"limit": 12,
"query": "multi-view consistency"
}
json{
"limit": 20,
"query": "multi-view consistency"
}
json{
"node_id": "OpenQuestion:multi-view-consistency"
}
thinking: I have enough anchors to draw a five-stage timeline. Compose it.
Concept-evolution timeline for 'Multi-View Consistency' across the corpus:
- 2020-03 - NeRF (arxiv-2003-08934): Establishes photometric multi-view consistency as the implicit optimisation signal. Per-scene loss, no other supervision.
- 2021-06 - NeuS (arxiv-2106-10689): Lifts the property from 'pixel agreement' to 'view-consistent SDF zero level set'. Geometry is the thing that should be consistent, not just radiance.
- 2021-08 - DROID-SLAM (arxiv-2108-10869): Operationalises the property under aggressive motion: deep bundle adjustment on learned features remains the most robust pose engine on EuRoC / TartanAir.
- 2023-08 / 2023-12 - 3DGS (arxiv-2308-04079) and MonoGS (arxiv-2312-06741): Splatting takes over view synthesis and SLAM but evaluates only on quasi-static / well-behaved trajectories. The property is assumed, not stress-tested.
- 2024-03 - 2DGS (arxiv-2403-17888) and MVSplat (arxiv-2403-14627): Two opposing moves on the same gap. 2DGS bakes view-consistency into the primitive geometry (oriented disk). MVSplat raises the question explicitly: feed-forward prediction inherits the consistency gap from sparse views.
Open question: the gap is the fast-motion regime. None of the splatting-SLAM papers report numbers on EuRoC/TartanAir, and no feed-forward GS method has been stress-tested against multi-view-inconsistent inputs. See questions/multi-view-consistency.md.
You're tracing this through GS papers, but 'multi-view consistency' is a term that predates GS by more than a decade — classical MVS used it constantly. Search the pre-NeRF era and the early NeRF era too. I want to see if the splatting authors are using the same definition the MVS people were using.
thinking: Fair. The corpus has at least DeepV2D (arxiv-1812-04605) and MVS-Texturing-era work from 2016 — let me search for the MVS lineage. There should also be UNISURF (2021) and NeuS (2021) which sit at the implicit-surface / radiance-field boundary.
Right — the classical MVS literature owns the term. Broadening the search.
Tool use (4)
json{
"limit": 10,
"query": "multi-view stereo consistency"
}
json{
"node_id": "Paper:arxiv-1607-02565"
}
json{
"node_id": "Paper:arxiv-2106-10689"
}
json{
"node_id": "Paper:arxiv-1812-04605"
}
thinking: Three distinct operationalisations are emerging: (1) COLMAP-style photometric+geometric pair check (depth/intensity reprojection), (2) DeepV2D-style differentiable photometric warping loss, (3) NeuS-style 'shared SDF surface', which is consistency by construction. NeRF sits closer to (2): the implicit photometric loss across views. 3DGS and MonoGS inherit (2) but never operationalise it as a check; they assume per-scene optimisation will produce it. Let me extend the timeline backward.
I have three pre-GS anchors now. COLMAP MVS (2016) operationalises the term explicitly as a per-pair pixel check. DeepV2D (2018) makes that check differentiable as a photometric warping loss. NeuS (2021) reframes it as a geometric property of a shared SDF. That's the lineage the GS papers walked into.
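To make the contrast between an explicit per-pair check and a photometric loss concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not either paper's implementation: `ncc` is plain unweighted normalised cross-correlation between two patches, and `l1_photometric` is a simple mean absolute difference between a reference patch and an already-warped source patch. Both function names are hypothetical, and the warping step itself is assumed to have happened upstream.

```python
import numpy as np

def ncc(patch_a: np.ndarray, patch_b: np.ndarray, eps: float = 1e-8) -> float:
    """Normalised cross-correlation of two same-shaped grayscale patches, in [-1, 1].

    Used here as a stand-in for an explicit photometric check: a score near 1
    means the two views agree up to a gain and bias change.
    """
    a = patch_a.astype(np.float64).ravel()
    b = patch_b.astype(np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom < eps:
        return 0.0  # flat patch: correlation is undefined, treat as no evidence
    return float(a @ b / denom)

def l1_photometric(ref: np.ndarray, warped: np.ndarray) -> float:
    """Mean absolute intensity difference between a reference patch and a
    source patch that has already been warped into the reference view
    (hypothetical upstream step). An autodiff framework would compute the
    same quantity as a training loss."""
    return float(np.abs(ref.astype(np.float64) - warped.astype(np.float64)).mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.random((7, 7))
    brighter = 2.0 * ref + 3.0  # same texture, different exposure
    print("NCC:", ncc(ref, brighter))
    print("L1 :", l1_photometric(ref, brighter))
```

The two quantities respond differently to an exposure change: the correlation score is invariant to gain and bias, while the raw intensity difference is not. That is one reason a thresholded check and a loss term are not interchangeable even when both are described as 'photometric consistency'.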
Good. So is there a single definition the corpus agrees on, or are these all incompatible operationalizations of different things people happen to call by the same name?
thinking: They're not the same. Three genuinely distinct definitions: (a) photometric reprojection consistency (COLMAP, DeepV2D, NeRF photometric loss), (b) geometric reprojection / depth consistency (COLMAP's forward-backward check, DROID-SLAM bundle adjustment residual), (c) surface consistency / shared zero level set (NeuS, UNISURF, 2DGS's view-consistent disk argument). 3DGS optimises (a) implicitly and silently fails on (b) and (c) — that's why SuGaR and 2DGS exist as patches. Let me pull UNISURF to confirm the surface-consistency thread.
They're not the same. There are at least three distinct things people call 'multi-view consistency'. Let me confirm against one more paper.
Tool use (2)
json{
"node_id": "Paper:arxiv-2104-10078"
}
json{
"limit": 8,
"query": "multi-view consistency photometric geometric surface"
}
thinking: Now the answer is clean: three operationalisations, no agreement. Final timeline should start at 2016 (COLMAP) and end 2024 (2DGS / MVSplat), grouped by which definition each paper picks up.
No, there's no single corpus-wide definition. Three distinct operationalisations:
(1) Photometric reprojection consistency — the radiance at a 3D point matches across all views.
- COLMAP MVS (2016, arxiv-1607-02565): explicit per-pixel NCC + forward-backward depth check.
- DeepV2D (2018, arxiv-1812-04605): differentiable photometric warping loss across video frames.
- NeRF (2020, arxiv-2003-08934): implicit — the per-scene photometric loss IS the consistency objective. No explicit check.
- 3DGS (2023, arxiv-2308-04079): inherits NeRF's photometric loss; no explicit check; assumed to hold after convergence.
(2) Geometric / depth consistency — the depth at a 3D point matches across all views (a stronger property than (1) — radiance can match while depth doesn't, e.g. specular surfaces).
- COLMAP MVS: forward-backward depth reprojection error in pixels.
- DROID-SLAM (2021, arxiv-2108-10869): deep bundle adjustment residual on learned correlation features — the production version of (2) under aggressive motion.
(3) Surface / topology consistency — there is a single shared surface every view refers to; consistency is by construction, not measured.
- NeuS (2021, arxiv-2106-10689), UNISURF (2021, arxiv-2104-10078): shared SDF zero level set.
- 2DGS (2024, arxiv-2403-17888): same idea at primitive level — each primitive is a single oriented disk, the same plane seen from every camera.
The definitions are not equivalent. (3) implies (2), which implies (1); the converses are false. A scene can be photometrically consistent (1) on a specular surface while being geometrically inconsistent (2), because radiance equality doesn't pin down depth. A method that achieves (1) without checking (2) or (3) can produce a multi-view-consistent render over a geometrically wrong scene; that's the gap the 3DGS-as-SLAM-map papers fall into. The open question (multi-view consistency under fast motion) therefore splits into two questions: which of the three definitions matters for which downstream task, and whether per-scene splat optimisation can hit (2) or (3) without explicit supervision.
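Definition (2) can be sketched as a forward-backward reprojection check for a single pixel. This is a minimal illustration, not COLMAP's code: it assumes a shared pinhole intrinsic matrix `K`, a relative pose with the convention `X_src = R @ X_ref + t`, depth measured along the optical axis, and a hypothetical `depth_src_at` callable standing in for a source depth-map lookup.

```python
import numpy as np

def backproject(K: np.ndarray, uv, depth: float) -> np.ndarray:
    """Lift pixel (u, v) with z-depth `depth` to a 3D point in camera coordinates."""
    u, v = uv
    return depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))

def project(K: np.ndarray, X: np.ndarray) -> np.ndarray:
    """Project a camera-frame 3D point to pixel coordinates."""
    x = K @ X
    return x[:2] / x[2]

def forward_backward_error(K, R, t, uv_ref, depth_ref, depth_src_at) -> float:
    """Reprojection error in pixels after a reference -> source -> reference round trip.

    Assumed convention: X_src = R @ X_ref + t.
    depth_src_at: hypothetical callable mapping a source pixel to its estimated depth.
    """
    X_ref = backproject(K, uv_ref, depth_ref)           # reference pixel to 3D
    uv_src = project(K, R @ X_ref + t)                  # forward: into the source view
    X_src = backproject(K, uv_src, depth_src_at(uv_src))
    X_back = R.T @ (X_src - t)                          # backward: into the reference frame
    uv_back = project(K, X_back)
    return float(np.linalg.norm(uv_back - np.asarray(uv_ref, dtype=np.float64)))

if __name__ == "__main__":
    K = np.array([[500.0, 0.0, 320.0],
                  [0.0, 500.0, 240.0],
                  [0.0, 0.0, 1.0]])
    R, t = np.eye(3), np.array([0.1, 0.0, 0.0])         # pure sideways translation
    agree = forward_backward_error(K, R, t, (300.0, 200.0), 2.0, lambda uv: 2.0)
    disagree = forward_backward_error(K, R, t, (300.0, 200.0), 2.0, lambda uv: 2.5)
    print(agree, disagree)
```

In this toy setup the two views see a fronto-parallel plane at depth 2. When the source depth agrees, the round trip returns to the starting pixel; when the source depth is off by 0.5, the pixel lands about 5 px away. The check uses no image intensities at all, which is why two views can agree on radiance, as in (1), and still fail it.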
Subagent sessions
No subagent transcripts attached.