Claude Code · 2026-05-14 · unknown branch

Session Summary: Concept trace — Multi-View Consistency across the corpus

Main outcome: Traced 'Multi-View Consistency' through the corpus as a five-stage timeline anchored to NeRF (2020), NeuS (2021), DROID-SLAM (2021), 3DGS / MonoGS (2023), and 2DGS / MVSplat (2024). The researcher redirected the scope to include the pre-NeRF MVS era, so the agent broadened to COLMAP MVS (2016), DeepV2D (2018), and UNISURF (2021), then compared definitions across papers and concluded there is no single corpus-wide definition: three distinct operationalisations (photometric reprojection, geometric depth consistency, shared-surface topology) are bundled under one name, and they are not equivalent.

11 Messages
12 Tool calls
0 Tokens
0 Subagents

Timeline & size

Project: Tesserae
Started: 2026-05-14T10:31:05Z
Ended: 2026-05-14T10:50:20Z
Model: claude-opus-4-7
Harness: claude-code
Raw transcript: examples/demo-corpus/.agent-sessions/2026-05-14-concept-trace/transcript.jsonl

Key decisions

None recorded.

Files touched

None recorded.

Commands run

None recorded.

Tools used

  • node_context
  • search_facts
  • search_nodes
  • timeline
  • wiki_page

Turn-by-turn conversation

Redacted user/assistant transcript turns, with assistant tool use collapsed under its response.

#1 User

Trace 'Multi-View Consistency' as a concept across the corpus.

#2 Assistant

thinking: Concept node first, then pull edges, then facts.

I'll look up the concept node, expand its neighbours, and then pull temporal facts that mention it.

Tool use (1)
#1 search_nodes
json
{
  "limit": 5,
  "query": "multi-view consistency",
  "type": "Concept"
}

#3 Assistant

thinking: No exact Concept match. The corpus may model it as the OpenQuestion instead. Re-search across types.

No Concept node for that exact name. Let me broaden the search.

Tool use (1)
#1 search_nodes
json
{
  "limit": 10,
  "query": "multi-view consistency"
}

#4 Assistant

thinking: The corpus tracks the concept through (a) the OpenQuestion node and (b) explicit fact assertions. Pull the OpenQuestion neighbours and search facts.

The concept lives as an OpenQuestion plus a related Concept (Multi-View Stereo). I'll trace via the question's neighbours and a fact search.

Tool use (4)
#1 node_context
json
{
  "limit": 30,
  "node_id": "OpenQuestion:multi-view-consistency"
}
#2 search_facts
json
{
  "limit": 12,
  "query": "multi-view consistency"
}
#3 timeline
json
{
  "limit": 20,
  "query": "multi-view consistency"
}
#4 wiki_page
json
{
  "node_id": "OpenQuestion:multi-view-consistency"
}

#5 Assistant

thinking: I have enough anchors to draw a five-stage timeline. Compose it.

Concept-evolution timeline for 'Multi-View Consistency' across the corpus:

  1. 2020-03 - NeRF (arxiv-2003-08934): Establishes photometric multi-view consistency as the implicit optimisation signal. Per-scene loss, no other supervision.
  2. 2021-06 - NeuS (arxiv-2106-10689): Lifts the property from 'pixel agreement' to 'view-consistent SDF zero level set'. Geometry is the thing that should be consistent, not just radiance.
  3. 2021-08 - DROID-SLAM (arxiv-2108-10869): Operationalises the property under aggressive motion: deep bundle adjustment on learned features remains the most robust pose engine on EuRoC / TartanAir.
  4. 2023-08 / 2023-12 - 3DGS (arxiv-2308-04079) and MonoGS (arxiv-2312-06741): Splatting takes over view synthesis and SLAM but evaluates only on quasi-static / well-behaved trajectories. The property is assumed, not stress-tested.
  5. 2024-03 - 2DGS (arxiv-2403-17888) and MVSplat (arxiv-2403-14627): Two opposing moves on the same gap. 2DGS bakes view-consistency into the primitive geometry (oriented disk). MVSplat raises the question explicitly: feed-forward prediction inherits the consistency gap from sparse views.

Open question: the gap is the fast-motion regime. None of the splatting-SLAM papers report numbers on EuRoC/TartanAir, and no feed-forward GS method has been stress-tested against multi-view-inconsistent inputs. See questions/multi-view-consistency.md.

#6 User

You're tracing this through GS papers, but 'multi-view consistency' is a term that predates GS by more than a decade — classical MVS used it constantly. Search the pre-NeRF era and the early NeRF era too. I want to see if the splatting authors are using the same definition the MVS people were using.

#7 Assistant

thinking: Fair. The corpus has at least DeepV2D (arxiv-1812-04605) and MVS-Texturing-era work from 2016 — let me search for the MVS lineage. There should also be UNISURF (2021) and NeuS (2021) which sit at the implicit-surface / radiance-field boundary.

Right — the classical MVS literature owns the term. Broadening the search.

Tool use (4)
#1 search_nodes
json
{
  "limit": 10,
  "query": "multi-view stereo consistency"
}
#2 wiki_page
json
{
  "node_id": "Paper:arxiv-1607-02565"
}
#3 wiki_page
json
{
  "node_id": "Paper:arxiv-2106-10689"
}
#4 wiki_page
json
{
  "node_id": "Paper:arxiv-1812-04605"
}

#8 Assistant

thinking: Three distinct operationalisations are emerging: (1) COLMAP-style photometric+geometric pair check (depth/intensity reprojection), (2) DeepV2D-style differentiable photometric warping loss, (3) NeuS-style 'shared SDF surface' which is by-construction consistency. NeRF sits closer to (2) — the implicit photometric loss across views. 3DGS+MonoGS inherit (2) but never operationalise it as a check; they assume per-scene optimisation will produce it. Let me extend the timeline backward.

I have three pre-GS anchors now. COLMAP MVS (2016) operationalises the term explicitly as a per-pair pixel check. DeepV2D (2018) makes that check differentiable as a photometric warping loss. NeuS (2021) reframes it as a geometric property of a shared SDF. That's the lineage the GS papers walked into.

#9 User

Good. So is there a single definition the corpus agrees on, or are these all incompatible operationalizations of different things people happen to call by the same name?

#10 Assistant

thinking: They're not the same. Three genuinely distinct definitions: (a) photometric reprojection consistency (COLMAP, DeepV2D, NeRF photometric loss), (b) geometric reprojection / depth consistency (COLMAP's forward-backward check, DROID-SLAM bundle adjustment residual), (c) surface consistency / shared zero level set (NeuS, UNISURF, 2DGS's view-consistent disk argument). 3DGS optimises (a) implicitly and silently fails on (b) and (c) — that's why SuGaR and 2DGS exist as patches. Let me pull UNISURF to confirm the surface-consistency thread.

They're not the same. There are at least three distinct things people call 'multi-view consistency'. Let me confirm against one more paper.

Tool use (2)
#1 wiki_page
json
{
  "node_id": "Paper:arxiv-2104-10078"
}
#2 search_facts
json
{
  "limit": 8,
  "query": "multi-view consistency photometric geometric surface"
}

#11 Assistant

thinking: Now the answer is clean: three operationalisations, no agreement. Final timeline should start at 2016 (COLMAP) and end 2024 (2DGS / MVSplat), grouped by which definition each paper picks up.

No, there's no single corpus-wide definition. Three distinct operationalisations:

(1) Photometric reprojection consistency — the radiance at a 3D point matches across all views.

  • COLMAP MVS (2016, arxiv-1607-02565): explicit per-pixel NCC + forward-backward depth check.
  • DeepV2D (2018, arxiv-1812-04605): differentiable photometric warping loss across video frames.
  • NeRF (2020, arxiv-2003-08934): implicit — the per-scene photometric loss IS the consistency objective. No explicit check.
  • 3DGS (2023, arxiv-2308-04079): inherits NeRF's photometric loss; no explicit check; assumed to hold after convergence.
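To make definition (1) concrete, here is a minimal sketch of a patch-level photometric check using zero-mean normalised cross-correlation (NCC). It is an illustration only, not COLMAP's implementation: the function names, the 7x7 patch size, and the 0.8 threshold are assumptions chosen for the example.

```python
import numpy as np

def ncc(patch_a: np.ndarray, patch_b: np.ndarray, eps: float = 1e-8) -> float:
    """Zero-mean normalised cross-correlation between two equal-sized patches."""
    a = patch_a.astype(np.float64).ravel()
    b = patch_b.astype(np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom < eps:
        # Textureless patch: correlation is undefined, so report 0.
        return 0.0
    return float(np.dot(a, b) / denom)

def photometrically_consistent(patch_a, patch_b, threshold=0.8):
    """Hypothetical accept/reject rule; the threshold is an assumed value."""
    return ncc(patch_a, patch_b) >= threshold

rng = np.random.default_rng(0)
ref = rng.random((7, 7))            # patch around a pixel in the reference view
same_point = 2.0 * ref + 0.1        # same texture seen with a gain/bias change
other_point = rng.random((7, 7))    # unrelated texture from a wrong match

print(ncc(ref, same_point), ncc(ref, other_point))
```

NCC is invariant to affine intensity changes, which is why the gain/bias-shifted patch still scores as a match. Note that a constant patch returns 0 here: a photometric score has nothing to say on textureless regions.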

(2) Geometric / depth consistency — the depth at a 3D point matches across all views (a stronger property than (1) — radiance can match while depth doesn't, e.g. specular surfaces).

  • COLMAP MVS: forward-backward depth reprojection error in pixels.
  • DROID-SLAM (2021, arxiv-2108-10869): deep bundle adjustment residual on learned correlation features — the production version of (2) under aggressive motion.
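A forward-backward depth check in the sense of definition (2) can be sketched as follows: lift a pixel from view A with its estimated depth, project it into view B, lift it again with B's own depth estimate, project back into A, and measure how far it lands from where it started. This is a simplified pinhole-camera illustration under assumed conventions (shared intrinsics `K`, pose given as `X_b = R_ab @ X_a + t_ab`, a hypothetical `depth_b_lookup` callable); it is not taken from COLMAP or DROID-SLAM.

```python
import numpy as np

def backproject(K, uv, depth):
    """Pixel (u, v) with depth along the optical axis -> 3D point in the camera frame."""
    u, v = uv
    return depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))

def project(K, X):
    """3D point in the camera frame -> (pixel coordinates, depth)."""
    x = K @ X
    return x[:2] / x[2], x[2]

def forward_backward_error(K, R_ab, t_ab, uv_a, depth_a, depth_b_lookup):
    """Reproject a pixel A -> B -> A and return the pixel distance in A."""
    X_a = backproject(K, uv_a, depth_a)
    uv_b, _ = project(K, R_ab @ X_a + t_ab)
    # Use view B's own depth estimate at the projected location.
    X_b = backproject(K, uv_b, depth_b_lookup(uv_b))
    X_back = R_ab.T @ (X_b - t_ab)
    uv_back, _ = project(K, X_back)
    return float(np.linalg.norm(uv_back - np.asarray(uv_a, dtype=float)))

# Toy scene: fronto-parallel plane at depth 5, camera B shifted 0.5 units along x.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R_ab = np.eye(3)
t_ab = np.array([-0.5, 0.0, 0.0])
plane_depth_b = lambda uv: 5.0      # view B sees the plane at depth 5 everywhere

err_good = forward_backward_error(K, R_ab, t_ab, (400.0, 260.0), 5.0, plane_depth_b)
err_bad = forward_backward_error(K, R_ab, t_ab, (400.0, 260.0), 4.0, plane_depth_b)
print(err_good, err_bad)
```

With the correct depth the round trip returns to the starting pixel up to floating-point error. With the wrong depth of 4, the two views disagree and the round trip lands 12.5 pixels away.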

(3) Surface / topology consistency — there is a single shared surface every view refers to; consistency is by construction, not measured.

  • NeuS (2021, arxiv-2106-10689), UNISURF (2021, arxiv-2104-10078): shared SDF zero level set.
  • 2DGS (2024, arxiv-2403-17888): same idea at primitive level — each primitive is a single oriented disk, the same plane seen from every camera.
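The "by construction" part of definition (3) can be illustrated with a toy signed distance function: every camera ray is intersected with the same zero level set, so two views aimed at the same surface point cannot disagree about where it is. The sketch below uses an analytic unit-sphere SDF and plain sphere tracing; it is an assumed stand-in for illustration and does not reproduce the volume-rendering formulations of NeuS or UNISURF.

```python
import numpy as np

def sphere_sdf(p, center=np.zeros(3), radius=1.0):
    """Signed distance from point p to a sphere (negative inside)."""
    return float(np.linalg.norm(p - center) - radius)

def sphere_trace(origin, direction, sdf, max_steps=64, tol=1e-6):
    """March along the ray by the SDF value until the zero level set is reached."""
    d = direction / np.linalg.norm(direction)
    t = 0.0
    for _ in range(max_steps):
        p = origin + t * d
        dist = sdf(p)
        if dist < tol:
            return p
        t += dist
    return None  # ray missed the surface within the step budget

# Two cameras looking at the same surface point from different directions.
target = np.array([-1.0, 0.0, -1.0]) / np.sqrt(2.0)   # lies on the unit sphere
cam_a = np.array([0.0, 0.0, -3.0])
cam_b = np.array([-3.0, 0.0, 0.0])

hit_a = sphere_trace(cam_a, target - cam_a, sphere_sdf)
hit_b = sphere_trace(cam_b, target - cam_b, sphere_sdf)
print(hit_a, hit_b)
```

Both rays terminate on the same level set, so the per-view surface points agree without any cross-view check. That is the sense in which this kind of consistency is a property of the representation and not something measured afterwards.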

The definitions are not equivalent. Roughly, (3) implies (2), and (2) implies (1) when appearance is view-independent; the converses are false. A reconstruction can satisfy (1) on a specular surface while violating (2), because matching radiance does not pin down depth. A method that achieves (1) without checking (2) or (3) can therefore produce multi-view-consistent renders over a geometrically wrong scene; that is the gap the 3DGS-as-SLAM-map papers fall into. The open question (multi-view consistency under fast motion) is really several questions: which definition matters for which downstream task, and whether per-scene splat optimisation can reach (2) or (3) without explicit supervision.
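The claim that photometric agreement does not pin down depth can be checked on a toy rectified stereo pair. The sketch below swaps in a textureless surface for the specular case mentioned above, because it is the simplest setting where the effect shows up; the 1D image rows, the disparity values, and the function name are all assumptions made for the example.

```python
import numpy as np

def photometric_residual(left_row, right_row, disparity):
    """Mean absolute intensity difference when left pixel x is matched to right pixel x - disparity."""
    xs = np.arange(disparity, len(left_row))
    return float(np.mean(np.abs(left_row[xs] - right_row[xs - disparity])))

true_disp = 4                        # ground-truth disparity (depth is proportional to 1 / disparity)
rng = np.random.default_rng(0)

# Textured surface: the right view is the left view shifted by the true disparity.
textured_left = rng.random(64)
textured_right = np.roll(textured_left, -true_disp)

# Textureless surface: every pixel has the same intensity in both views.
flat_left = np.full(64, 0.5)
flat_right = np.roll(flat_left, -true_disp)

for name, left, right in [("textured", textured_left, textured_right),
                          ("flat", flat_left, flat_right)]:
    residuals = {d: photometric_residual(left, right, d) for d in (2, 4, 7)}
    print(name, residuals)
```

On the textured row only the true disparity gives a zero residual. On the flat row every disparity hypothesis gives a zero residual, so the scene is photometrically consistent at any depth, which is why a method that optimises only (1) can converge to wrong geometry.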

Subagent sessions

No subagent transcripts attached.
