Evidence

Claim-to-evidence map for Metriplane.

This page maps public claims to concrete release evidence. Each claim should be inspectable through the repository, release archive, evidence package, or reproduction path.

Release evidence

Verified release values

580 tests passed (source: evidence/paper_v2_0/test_output.txt)
DOI archived on Zenodo: 10.5281/zenodo.20736619
6 physical events (source: atlas_assembly_cell_run.txt)
1 incident (source: atlas_assembly_cell_run.txt)
35.0 s missing-tool delay (source: cell_truth_report.md)
pass=true for bundle verification (source: bundle_verify.txt)
PASS for the generated regression test (source: regression_test.json)

Table

Every claim has a verification path and a boundary.

Claim: v0.2.0 is archived as a public release
Evidence: GitHub release and Zenodo DOI 10.5281/zenodo.20736619
How to verify: Open the v0.2.0 release and Zenodo record.
Boundary: SoftwareX acceptance or peer review is not claimed.

Claim: The release gate passed locally
Evidence: evidence/paper_v2_0/test_output.txt reports 580 passed.
How to verify: Run PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 .venv/bin/python -m pytest -q
Boundary: Captured in one local release environment.

Claim: The camera-free replay is deterministic
Evidence: deterministic_replay.txt reports pass=true, 0.0 cm position difference, and 0 event mismatches.
How to verify: Run RUNS=evidence/paper_v2_0/runs ./tools/mp.sh deterministic-replay datasets/demo/session_001.jsonl
Boundary: Checked-in demo session only; no live-camera claim.

Claim: The assembly-cell replay produces one incident
Evidence: atlas_assembly_cell_run.txt reports events=6 and incidents=1.
How to verify: Run metriplane atlas run with the assembly_cell domain pack.
Boundary: One deterministic assembly-cell domain pack and replay.

Claim: INC-0001 is explainable as a Cell Truth Report
Evidence: cell_truth_report.md records a 35.0 s wait for torque_driver_1.
How to verify: Inspect evidence/paper_v2_0/atlas_run/cell_truth_report.md
Boundary: Derived from replayed planar state, not raw-video judgement.

Claim: The incident is packaged as portable evidence
Evidence: INC-0001.zip contains manifest, checksums, incident, timeline, report, and replay command.
How to verify: Inspect artifacts/INC-0001_zip_listing.txt and the bundle manifest.
Boundary: Local content/checksum verification; not malware scanning.

Claim: The evidence bundle verifies
Evidence: bundle_verify.txt reports JSON with pass=true and no errors.
How to verify: Run metriplane atlas bundle verify evidence/paper_v2_0/atlas_run/evidence_bundles/INC-0001.zip
Boundary: Verifies this bundle schema/content locally.

Claim: The incident becomes a regression test
Evidence: regression_test.json reports pass=true for missing_tool_caused_delay_INC-0001.
How to verify: Run metriplane atlas test evidence/paper_v2_0/atlas_run/regression_tests/INC-0001.yaml --json
Boundary: Regression covers this generated incident expectation.

Claim: The project is observe-only and bounded
Evidence: Release claim boundaries, Atlas docs, and explicit non-claims.
How to verify: Read docs/release_v0_2_claims.md and docs/atlas/README.md
Boundary: No robot control, safety certification, quality approval, marker-free tracking, or production validation.
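
Several rows above rest on a JSON report that says pass=true with no errors. A reviewer who wants to check such a report mechanically, rather than by eye, could use something like the sketch below. It is an illustration only: the key names "pass" and "errors" are assumptions inferred from the wording on this page, not a documented Metriplane schema, and the helper report_passes is hypothetical.

```python
import json


def report_passes(report_text):
    """Return True if a JSON verification report passed with no errors.

    Assumption (not confirmed by the release docs): the report is a JSON
    object with a boolean "pass" key and an optional "errors" collection.
    """
    report = json.loads(report_text)
    if not isinstance(report, dict):
        return False
    # Require an explicit JSON true, not merely a truthy value such as "true".
    passed = report.get("pass") is True
    # A missing, null, or empty "errors" field counts as "no errors".
    errors = report.get("errors")
    return passed and not errors
```

To try it against a real file, read the text of bundle_verify.txt or regression_test.json and pass it to report_passes. If the actual report uses different key names, adjust the two lookups accordingly.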

Primary artifacts

Files to inspect in the release evidence package.

These paths are in the Metriplane source repository under evidence/paper_v2_0.

Artifacts:

evidence/paper_v2_0/test_output.txt
evidence/paper_v2_0/logs/deterministic_replay.txt
evidence/paper_v2_0/logs/atlas_assembly_cell_run.txt
evidence/paper_v2_0/logs/bundle_verify.txt
evidence/paper_v2_0/logs/regression_test.json
evidence/paper_v2_0/atlas_run/cell_truth_report.md
evidence/paper_v2_0/atlas_run/evidence_bundles/INC-0001.zip
evidence/paper_v2_0/atlas_run/regression_tests/INC-0001.yaml
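
Before inspecting individual files, it can help to confirm that a checkout actually contains every artifact path listed above. The following is a minimal sketch under that assumption; missing_artifacts is a hypothetical helper, not part of the Metriplane tooling, and it checks only that each file exists, not what it contains.

```python
from pathlib import Path

# Paths copied from this page, relative to the repository root.
ARTIFACTS = [
    "evidence/paper_v2_0/test_output.txt",
    "evidence/paper_v2_0/logs/deterministic_replay.txt",
    "evidence/paper_v2_0/logs/atlas_assembly_cell_run.txt",
    "evidence/paper_v2_0/logs/bundle_verify.txt",
    "evidence/paper_v2_0/logs/regression_test.json",
    "evidence/paper_v2_0/atlas_run/cell_truth_report.md",
    "evidence/paper_v2_0/atlas_run/evidence_bundles/INC-0001.zip",
    "evidence/paper_v2_0/atlas_run/regression_tests/INC-0001.yaml",
]


def missing_artifacts(repo_root, paths=ARTIFACTS):
    """Return the listed paths that are not regular files under repo_root."""
    root = Path(repo_root)
    return [p for p in paths if not (root / p).is_file()]
```

Calling missing_artifacts(".") from the repository root returns an empty list when all eight files are present, and otherwise returns the missing paths in the order listed.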

Boundaries

The evidence is useful because the claims are narrow.

The release package demonstrates a bounded, reproducible workflow for a replayed assembly-cell incident. It does not claim a production factory deployment or a safety-certified decision system.

Observe-only
Planar/tagged-asset scoped
Camera-free replay path
No robot or machine control
No safety certification
No quality-release approval
No people recognition
No marker-free tracking claim
No full 3D reconstruction claim
No production-factory validation
No factory-wide deployment readiness

Next step

Challenge the evidence map.

The best feedback is concrete: missing source, weak claim, unclear command, or a boundary that should be sharper.