Migrating from the Python POC to sdivi-rust
Overview
sdivi-rust is a full reimplementation of the Structural Divergence Indexer in
Rust. It replaces the original Python POC
(structural-divergence-indexer).
sdivi-rust is not backward-compatible with the Python POC’s snapshot files.
The migration involves:
- Installing sdivi-rust and running sdivi init
- Accepting a trend-continuity reset (see Snapshot Schema)
- Copying your .sdivi/config.toml and .sdivi/boundaries.yaml unchanged
- Running sdivi snapshot to establish a new baseline
What Carries Over
| Item | Status |
|---|---|
| .sdivi/config.toml syntax and keys | Fully compatible (same schema) |
| .sdivi/boundaries.yaml syntax | Fully compatible (same schema) |
| Per-category threshold overrides | Fully compatible |
| NO_COLOR, SDIVI_LOG_LEVEL, SDIVI_WORKERS env vars | Fully compatible |
| Exit codes 0, 1, 2, 10 | Same semantics |
| CLI subcommands (init, snapshot, check, show, diff, trend) | Same interface |
| Boundary management (boundaries infer/ratify/show) | New in sdivi-rust |
What Changes
Snapshot Schema (breaking)
sdivi-rust uses snapshot_version: "1.0". The Python POC used
snapshot_version: "0.1.0". sdivi-rust does not read the Python POC’s
snapshot files.
Effect: All trend history is lost on migration. sdivi trend will show no
data until two or more sdivi-rust snapshots have accumulated. This is an
intentional clean break (KDD-1) — the Python POC’s snapshots are tool-generated
and trivially regeneratable.
Mitigation: If you need trend continuity, run sdivi snapshot --commit <sha>
against each historical commit before going live. There is no automated
backfill command; use a shell loop.
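The shell loop mentioned above could look like the following minimal sketch. It assumes that sdivi snapshot --commit <sha> analyses the named commit on its own; if it only snapshots the working tree, check out each commit before the call. The function name backfill_snapshots and the SDIVI override variable are illustrative and not part of sdivi-rust.

```shell
#!/bin/sh
# Backfill sketch: take one sdivi snapshot per historical commit, oldest first.
# SDIVI is a hypothetical override so the loop can be dry-run with a stub.
SDIVI="${SDIVI:-sdivi}"

backfill_snapshots() {
    # $1: revision or range understood by git rev-list (default: HEAD)
    range="${1:-HEAD}"
    git rev-list --reverse --first-parent "$range" | while read -r sha; do
        # Redirect stdin so the command cannot consume the commit list.
        "$SDIVI" snapshot --commit "$sha" </dev/null || {
            echo "snapshot failed at $sha" >&2
            return 1
        }
    done
}

# Usage (from the repository root):
#   backfill_snapshots HEAD
#   backfill_snapshots v1.0..HEAD   # example range; any rev-list range works
```

Walking the history oldest first means sdivi trend sees the snapshots in commit order. Restricting the range keeps the run short on long histories.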
Exit Code 3 (new)
sdivi-rust exits 3 when all detected languages in the repository lack
tree-sitter grammars (e.g. a repo containing only .xyz files with no
registered adapter). The Python POC had no equivalent.
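CI scripts written for the Python POC may want to handle the new code explicitly. The sketch below only distinguishes exit 0 and exit 3, because this guide does not spell out the meanings of 1, 2 and 10. The helper name describe_sdivi_exit is illustrative.

```shell
#!/bin/sh
# Sketch: map an sdivi exit status to a short message for CI logs.
describe_sdivi_exit() {
    # $1: exit status returned by an sdivi command
    case "$1" in
        0) echo "ok" ;;
        3) echo "no tree-sitter grammar for any detected language" ;;
        *) echo "other sdivi exit code: $1" ;;
    esac
}

# Usage:
#   sdivi snapshot; describe_sdivi_exit "$?"
```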
Pattern Fingerprints
sdivi-rust uses blake3 (keyed hash) for pattern fingerprints. The Python POC
used a different hashing scheme. Fingerprint values from the Python POC’s
snapshots are not comparable to sdivi-rust fingerprints.
Boundary Inference
sdivi boundaries is new in sdivi-rust; the Python POC had no equivalent
subcommand. The .sdivi/boundaries.yaml format is compatible, but boundaries
inferred by sdivi-rust's native Leiden community detection are not
bit-identical to the communities the Python POC computed.
Coupling Topology Metrics
The graph_metrics / coupling_topology field names differ between the Python
POC and sdivi-rust snapshots. sdivi-rust uses:
- graph.node_count, graph.edge_count, graph.density, graph.cycle_count
- partition.community_count(), partition.modularity, partition.seed
The Python POC used a flat graph_metrics object.
sdivi boundaries ratify Loses YAML Comments
When sdivi boundaries ratify rewrites .sdivi/boundaries.yaml, all YAML
comments are lost. This is a known limitation (KDD-6) accepted for the MVP.
Workaround: Keep explanatory comments in a separate document and reference
them by boundary name. Do not run ratify on a hand-edited file.
See the YAML comment loss section below.
What is NOT Affected
- Manual edits to .sdivi/config.toml or .sdivi/boundaries.yaml
- The .sdivi/ directory layout (snapshots/, cache/, boundaries.yaml)
- The meaning and semantics of threshold values
- The expires requirement on per-category overrides
Migration Steps
# 1. Install sdivi-rust
cargo install sdivi-cli
# 2. Verify config compatibility
sdivi --repo /path/to/your/repo init # rewrites config only if missing
# 3. Take a first baseline snapshot
sdivi snapshot --commit "$(git rev-parse HEAD)"
# 4. Verify the check gate
sdivi check
Comparing Results Against the Python POC
When validating sdivi-rust output against a Python POC baseline on the same repo at the same commit, expect the following metric tolerances:
| Metric | Acceptable variance |
|---|---|
| Modularity | Within ±1% |
| Community count | Within ±10% |
| Pattern entropy | Within ±5% |
These tolerances exist because the native Leiden port (KDD-2) is verified for
partition quality, not bit identity with the Python leidenalg library. Pattern
entropy can differ slightly because sdivi-rust’s tree-sitter normalisation rules
are stricter than the Python POC’s heuristic walk.
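A tolerance check along these lines can be scripted. The sketch below assumes the variance is measured relative to the Python POC value, which this guide does not state explicitly. The helper name within_pct is illustrative, and the metric values in the usage comment are made up.

```shell
#!/bin/sh
# within_pct POC_VALUE RUST_VALUE PCT
# Succeeds (exit 0) when RUST_VALUE differs from POC_VALUE by at most
# PCT percent of POC_VALUE. Assumption: variance is relative to the POC value.
within_pct() {
    awk -v a="$1" -v b="$2" -v p="$3" 'BEGIN {
        d = a - b;  if (d < 0) d = -d        # absolute difference
        ref = a;    if (ref < 0) ref = -ref  # magnitude of the POC value
        exit !(d <= ref * p / 100)
    }'
}

# Usage (values are made up for illustration):
#   within_pct 0.412 0.415 1  && echo "modularity within tolerance"
#   within_pct 20 23 10       || echo "community count outside tolerance"
```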
YAML Comment Loss
What changed
sdivi boundaries ratify writes .sdivi/boundaries.yaml programmatically using
serde_yml. All YAML comments are lost whenever ratify overwrites the file.
What you will see
sdivi: warning: '.sdivi/boundaries.yaml' contains YAML comments — comments will be
lost after ratify (see docs/migrating-from-the-python-poc.md)
The command still succeeds (exit 0). The comment-stripped version is written atomically.
Why this happens
Comment-preserving YAML round-trips require a hand-written emitter or an immature crate; neither is acceptable for the MVP quality bar (KDD-6).
Workarounds
- Keep comments in a separate doc and link from boundary names
- Do not run ratify on a hand-edited file; use ratify only on fresh output
- Version-control the file; deleted comments can be recovered from git history
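To avoid running ratify on a commented file by accident, a wrapper script can check for comments first. The following is a heuristic sketch and does not use any sdivi feature: it only detects full-line comments, and the helper name has_yaml_comments is illustrative.

```shell
#!/bin/sh
# has_yaml_comments FILE
# Succeeds (exit 0) when FILE contains at least one full-line YAML comment.
# Heuristic only: trailing comments after a value are not detected.
has_yaml_comments() {
    grep -Eq '^[[:space:]]*#' "$1"
}

# Usage:
#   if has_yaml_comments .sdivi/boundaries.yaml; then
#       echo "boundaries.yaml has comments; commit it before running ratify" >&2
#   fi
```

If comments were already stripped, git show <rev>:.sdivi/boundaries.yaml prints the file as it was at an earlier revision.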