
Migrating from the Python POC to sdivi-rust

Overview

sdivi-rust is a full reimplementation of the Structural Divergence Indexer in Rust. It replaces the original Python POC (structural-divergence-indexer). sdivi-rust is not backward-compatible with the Python POC’s snapshot files. The migration involves:

  1. Installing sdivi-rust and running sdivi init
  2. Accepting a trend-continuity reset (see Snapshot Schema)
  3. Copying your .sdivi/config.toml and .sdivi/boundaries.yaml unchanged
  4. Running sdivi snapshot to establish a new baseline

What Carries Over

Item                                                        Status
.sdivi/config.toml syntax and keys                          Fully compatible (same schema)
.sdivi/boundaries.yaml syntax                               Fully compatible (same schema)
Per-category threshold overrides                            Fully compatible
NO_COLOR, SDIVI_LOG_LEVEL, SDIVI_WORKERS env vars           Fully compatible
Exit codes 0, 1, 2, 10                                      Same semantics
CLI subcommands (init, snapshot, check, show, diff, trend)  Same interface
Boundary management (boundaries infer/ratify/show)          New in sdivi-rust

What Changes

Snapshot Schema (breaking)

sdivi-rust uses snapshot_version: "1.0". The Python POC used snapshot_version: "0.1.0". sdivi-rust does not read the Python POC’s snapshot files.

Effect: All trend history is lost on migration. sdivi trend will show no data until two or more sdivi-rust snapshots have accumulated. This is an intentional clean break (KDD-1) — the Python POC’s snapshots are tool-generated and trivially regeneratable.

Mitigation: If you need trend continuity, run sdivi snapshot --commit <sha> against each historical commit before going live. There is no automated backfill command; use a shell loop.
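
The backfill loop described above could look roughly like the sketch below. It assumes that sdivi snapshot --commit <sha> analyses the tree at that commit by itself; if the flag only labels the snapshot, each commit would need to be checked out before the call. SDIVI_BIN and backfill_snapshots are hypothetical names introduced for illustration, not part of sdivi-rust.

```shell
#!/bin/sh
# Backfill sketch: take one snapshot per historical commit, oldest first.
# Assumption: `sdivi snapshot --commit <sha>` analyses the tree at <sha> itself.
set -eu

# Hypothetical override so the loop can be dry-run, e.g. SDIVI_BIN=echo.
SDIVI_BIN="${SDIVI_BIN:-sdivi}"

backfill_snapshots() {
  range="$1"   # any git revision or range, e.g. <old-sha>..HEAD
  # --reverse lists commits oldest-first so snapshots accumulate in history order;
  # --first-parent follows the mainline and skips commits from merged branches.
  shas=$(git rev-list --reverse --first-parent "$range") || return 1
  for sha in $shas; do
    "$SDIVI_BIN" snapshot --commit "$sha" || return 1
  done
}
```

Calling backfill_snapshots with a revision range from inside the repository runs the loop; setting SDIVI_BIN=echo first prints the commands without executing them, which is a cheap way to review the commit list before a long run.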

Exit Code 3 (new)

sdivi-rust exits 3 when all detected languages in the repository lack tree-sitter grammars (e.g. a repo containing only .xyz files with no registered adapter). The Python POC had no equivalent.
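
CI scripts written for the Python POC may want to distinguish this new code from a failed gate. The following is a minimal sketch, assuming that sdivi check is one of the subcommands that can return 3 (the text above does not say which ones do) and that skipping the gate is an acceptable policy for such repositories. SDIVI_BIN and run_check_gate are hypothetical names.

```shell
#!/bin/sh
# Hypothetical override, useful for exercising the wrapper with a stub command.
SDIVI_BIN="${SDIVI_BIN:-sdivi}"

run_check_gate() {
  status=0
  # Capture the exit status without aborting under `set -e`.
  "$SDIVI_BIN" check || status=$?
  case "$status" in
    3)
      # Assumed policy: no supported languages means there is nothing to gate on.
      echo "sdivi: no supported languages detected; skipping gate" >&2
      return 0
      ;;
    *)
      # 0 and every other exit code pass through unchanged.
      return "$status"
      ;;
  esac
}
```

Whether a repository without supported languages should pass or fail the pipeline is a project decision; changing the return value in the 3 branch is enough to flip it.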

Pattern Fingerprints

sdivi-rust uses blake3 (keyed hash) for pattern fingerprints. The Python POC used a different hashing scheme. Fingerprint values from the Python POC’s snapshots are not comparable to sdivi-rust fingerprints.

Boundary Inference

sdivi boundaries is new in sdivi-rust. The Python POC had no equivalent subcommand. The .sdivi/boundaries.yaml format is compatible, but the native Leiden community detection that drives inference does not produce results bit-identical to the Python POC’s community detection.

Coupling Topology Metrics

The graph_metrics / coupling_topology field names differ between the Python POC and sdivi-rust snapshots. sdivi-rust uses:

The Python POC used a flat graph_metrics object.

sdivi boundaries ratify Loses YAML Comments

When sdivi boundaries ratify rewrites .sdivi/boundaries.yaml, all YAML comments are lost. This is a known limitation (KDD-6) accepted for the MVP.

Workaround: Keep explanatory comments in a separate document and reference them by boundary name. Do not run ratify on a hand-edited file.

See the YAML comment loss section below.

What is NOT Affected

Migration Steps

# 1. Install sdivi-rust
cargo install sdivi-cli

# 2. Verify config compatibility
sdivi --repo /path/to/your/repo init   # rewrites config only if missing

# 3. Take a first baseline snapshot
sdivi snapshot --commit "$(git rev-parse HEAD)"

# 4. Verify the check gate
sdivi check

Comparing Results Against the Python POC

When validating sdivi-rust output against a Python POC baseline on the same repo at the same commit, expect the following metric tolerances:

Metric            Acceptable variance
Modularity        Within 1%
Community count   Within ±10%
Pattern entropy   Within 5%

These tolerances exist because the native Leiden port (KDD-2) is verified for partition quality, not bit identity with the Python leidenalg library. Pattern entropy can differ slightly because sdivi-rust’s tree-sitter normalisation rules are stricter than the Python POC’s heuristic walk.
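
To apply these tolerances in a validation script, a small helper can compare two values. This sketch assumes each tolerance is a relative difference measured against the Python POC value, which the table does not state explicitly. within_pct is a hypothetical helper, and how the metric values are extracted from each tool's snapshot is left out.

```shell
#!/bin/sh
# within_pct BASELINE CANDIDATE PCT
# Succeeds (exit 0) when |CANDIDATE - BASELINE| <= PCT percent of |BASELINE|.
# Assumption: tolerances are relative to the Python POC (baseline) value.
within_pct() {
  awk -v b="$1" -v c="$2" -v p="$3" 'BEGIN {
    d = c - b
    if (d < 0) d = -d      # absolute difference
    if (b < 0) b = -b      # absolute baseline
    exit !(d <= b * p / 100)
  }'
}
```

For example, within_pct "$poc_modularity" "$rust_modularity" 1 would express the modularity row, where both variables are placeholders for values read from the two tools' outputs.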

YAML Comment Loss

What changed

sdivi boundaries ratify writes .sdivi/boundaries.yaml programmatically using serde_yml. All YAML comments are lost whenever ratify overwrites the file.

What you will see

sdivi: warning: '.sdivi/boundaries.yaml' contains YAML comments — comments will be
lost after ratify (see docs/migrating-from-the-python-poc.md)

The command still succeeds (exit 0). The comment-stripped version is written atomically.
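
Because the exit status stays 0, a script cannot detect the comment loss after the fact. One option is to save a copy before running ratify, as in this sketch. The comment check is a heuristic (a # at the start of a line or after whitespace) and can misfire on a # inside a quoted string; has_yaml_comments, backup_if_commented, and the .pre-ratify suffix are hypothetical names.

```shell
#!/bin/sh
# Heuristic: a YAML comment starts with '#' at line start or after whitespace.
has_yaml_comments() {
  grep -Eq '(^|[[:space:]])#' "$1"
}

# Copy FILE to FILE.pre-ratify when it appears to contain comments,
# so they can be recovered by hand after the file is rewritten.
backup_if_commented() {
  file="$1"
  if has_yaml_comments "$file"; then
    cp "$file" "$file.pre-ratify"
    echo "saved commented copy to $file.pre-ratify"
  fi
}
```

Running backup_if_commented .sdivi/boundaries.yaml immediately before sdivi boundaries ratify keeps the commented version on disk for comparison with the rewritten file.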

Why this happens

Comment-preserving YAML round-trips require a hand-written emitter or an immature crate; neither is acceptable for the MVP quality bar (KDD-6).

Workarounds