From d81577a46e9a579adb4e6062d83aef1f3eb2fb25 Mon Sep 17 00:00:00 2001 From: Josh Kingsley Date: Fri, 14 Nov 2025 00:14:25 +0200 Subject: docs: update design docs --- docs/design/README.md | 26 ++-- docs/design/crdt-design.md | 355 +++++++++++++++++++++------------------------ 2 files changed, 181 insertions(+), 200 deletions(-) (limited to 'docs') diff --git a/docs/design/README.md b/docs/design/README.md index 678addb..7486251 100644 --- a/docs/design/README.md +++ b/docs/design/README.md @@ -41,29 +41,29 @@ For someone new to the project: The CRDT is being implemented as a Rust crate in `/crdt`: - ✅ Vector clock implementation - ✅ Basic Doc structure with operation log -- ⏳ Core domain types (Cell, Row, Grid) -- ⏳ CreateGrid operation -- ⏳ Subdivide operation -- ⏳ Fractional indexing +- ✅ Core domain types (Cell, Row, Grid, DerivedId) +- ✅ CreateGrid operation +- ✅ ChangeSubdivisions operation +- ❌ Fractional indexing (decided not to use in operations) - ⏳ Conflict detection - ⏳ Branch-based merge ## Design Principles -1. **Convergence** - All replicas eventually reach the same state -2. **Commutativity** - Operations can be applied in any order -3. **Idempotency** - Re-applying operations is safe -4. **No Silent Data Loss** - Conflicts are explicit, not silently resolved -5. **Causal Consistency** - Operations respect causality -6. **Domain Invariants** - Preserve musical constraints (e.g., row duration) +1. **Causal Consistency** - Operations respect causality via vector clocks +2. **Deterministic Realization** - Each branch produces consistent state from its operation sequence +3. **No Silent Data Loss** - Conflicts are explicit, not silently resolved +4. **Domain Invariants** - Preserve musical constraints (e.g., row duration) +5. **User Control** - Explicit conflict resolution for ambiguous concurrent edits +6. **Simplicity Over Commutativity** - Branch-based resolution instead of forcing all operations to commute ## Key Architectural Decisions - **Operation-based CRDT** (not state-based) for semantic operations - **Vector clocks** for causality tracking (future: consider Hybrid Logical Clocks) -- **Fractional indexing** for stable positions -- **Tombstones** for deletions (never truly remove cells) -- **Conflict detection** rather than automatic LWW +- **Deterministic IDs** using `DerivedId` (no fractional indexing in operations) +- **Range-based subdivision** instead of tombstones (simpler semantics) +- **Branch-based conflict resolution** rather than automatic commutativity - **User-driven resolution** for ambiguous concurrent edits ## Future Work diff --git a/docs/design/crdt-design.md b/docs/design/crdt-design.md index 5e6dd2d..b98c5ed 100644 --- a/docs/design/crdt-design.md +++ b/docs/design/crdt-design.md @@ -4,7 +4,7 @@ This document captures the design considerations for implementing a CRDT-based state management system for a music notation grid that supports rhythmic subdivisions (e.g., triplets, quintuplets). -**Last Updated:** 2025-11-12 +**Last Updated:** 2025-11-14 ## Domain Context @@ -35,227 +35,206 @@ This document captures the design considerations for implementing a CRDT-based s } ``` -## Recommended Approach: Operation-Based CRDT +## Implemented Approach: Operation-Based CRDT with Branch-Based Conflict Resolution ### Core Principles -1. **Cell IDs instead of indices** - stable references across replicas -2. **Subdivide = delete + insert** - clear semantics that preserve duration -3. **Conflict detection** - width violations shown to users, not silently resolved -4. **Operation-level commutativity** - same result regardless of operation order - -### Type Definitions - -```typescript -export type CellId = string; // UUID - -export type Cell = Immutable<{ - id: CellId; - - // Fractional index for positioning (like Figma, Linear) - position: string; // e.g., "a0", "a0.5", "a1" - - // Rhythmic duration this cell represents (e.g., 0.25 = quarter note) - duration: number; - - // Reference to parent if this is a subdivision - parentCellId: CellId | null; - - // CRDT metadata - createdBy: string; // OpId - timestamp: HLCTimestamp; - deleted: boolean; // Tombstone for deletions -}>; - -export type CreateGrid = Immutable<{ - type: "createGrid"; - opId: string; - timestamp: HLCTimestamp; - gridId: string; - rows: number; - baseCellsPerRow: number; - - // Pre-generated cell IDs for initial cells - initialCellIds: CellId[][]; // [row][col] -}>; - -export type Subdivide = Immutable<{ - type: "subdivide"; - opId: string; - timestamp: HLCTimestamp; - gridId: string; - rowIndex: number; - - // Cells to mark as deleted (tombstone) - deletedCellIds: CellId[]; - - // New cells to insert - newCells: Array<{ - id: CellId; - position: string; // Fractional index between neighbors - duration: number; // Must sum to deleted cells' total duration - }>; -}>; -``` +1. **Cell IDs instead of indices** - stable references across replicas using deterministic `DerivedId` +2. **Subdivide = replace range** - clear semantics that preserve duration +3. **Branch-based conflict detection** - conflicts detected via vector clocks, resolved explicitly by users +4. **Deterministic realization** - each branch produces consistent state from its operation sequence +5. **No fractional indexing in operations** - positions computed during `realize()`, not stored -### Key Design Decisions +### Type Definitions (Rust Implementation) -#### 1. Fractional Positioning +```rust +// Deterministic cell ID derived from operation UUID +pub struct DerivedId { + id: Uuid, // Operation UUID (cloned, can be interned later) + tag: &'static str, // "grid", "row", or "cell" + index: usize, // Index within the operation's scope +} -Uses fractional indexing (similar to Figma's approach) to allow insertion between any two cells without reindexing. +pub struct Cell { + id: DerivedId, + // Duration and other properties to be added +} -```typescript -// Example positions: -"a0" // First cell -"a0.5" // Inserted between first and second -"a1" // Second cell -"a1.25" // Inserted between second and third -``` +pub struct Row { + id: DerivedId, + cells: Vec, +} -#### 2. Tombstones for Deletions +pub struct Grid { + id: DerivedId, + rows: Vec, +} + +pub enum OpPayload { + CreateGrid { + rows: usize, + base_cells_per_row: usize, + // Cell IDs are derived deterministically: op.id.derive_id("cell", idx) + }, + + ChangeSubdivisions { + grid_id: DerivedId, + row_id: DerivedId, + start_cell_id: DerivedId, // First cell in range + end_cell_id: DerivedId, // Last cell in range (inclusive) + subdivisions: usize, // Number of new cells + // New cell IDs derived as: op.id.derive_id("cell", idx) + }, +} -Cells are never truly deleted, just marked with `deleted: true`. This ensures: -- Idempotent operations (deleting twice is safe) -- Causal consistency (can tell what existed when) -- Conflict detection (know if concurrent ops targeted same cells) +pub struct Op { + id: Uuid, // Unique operation ID + clock: VectorClock, // For causality tracking + payload: OpPayload, +} +``` -#### 3. Duration Preservation by Construction +### Key Design Decisions -Each `Subdivide` operation ensures: -```typescript -const deletedDuration = deletedCellIds.reduce( - (sum, id) => sum + getCellById(id).duration, - 0 -); +#### 1. Deterministic Cell IDs (No Fractional Indexing) -const insertedDuration = newCells.reduce( - (sum, cell) => sum + cell.duration, - 0 -); +**Decision:** Cell IDs are derived deterministically from the operation ID and index, not stored explicitly. -// Must be equal (enforced when creating operation) -assert(deletedDuration === insertedDuration); +```rust +// Creating a grid operation +op_id = Uuid::now_v7() +cells = (0..16).map(|i| op_id.derive_id("cell", i)) +// Produces: [uuid:cell:0, uuid:cell:1, ..., uuid:cell:15] ``` -## Conflict Resolution Options +**Why no fractional indexing in operations?** +- With branch-based conflict resolution, operations within a branch are sequential, not concurrent +- Each branch produces a deterministic cell order from its operation sequence +- Fractional positions would be regenerated on every `realize()` anyway +- Conflicts between branches are detected and resolved explicitly, not merged automatically +- Simpler operation structure and smaller serialization size -### Option 1: Last-Writer-Wins Per Cell +**Positions are computed during realization** if needed for UI, not stored in operations. -```typescript -function applySubdivide(cells: Cell[], op: Subdivide): Cell[] { - const targetCellsExist = op.deletedCellIds.every(id => - cells.find(c => c.id === id && !c.deleted) - ); +#### 2. Branch-Based Conflict Resolution (Not Tombstones) - if (!targetCellsExist) { - // Target cells already deleted by concurrent op - // This op becomes no-op (idempotent) - return cells; - } +**Decision:** Currently using `Vec::splice` to replace cell ranges. Future conflict detection will use vector clocks. - // Apply deletion + insertion - // ... -} -``` +Instead of making operations commutative with tombstones, we: +1. Apply operations in causal order within each branch +2. Detect divergent branches using vector clocks +3. Show conflicts to users when branches have incompatible operations +4. Let users choose resolution strategy -**Pros:** -- Simple, deterministic -- Guaranteed commutative (same result regardless of order) +**Benefits:** +- Simpler operation semantics (just replace this range) +- No accumulated tombstones taking up memory +- Conflicts are explicit, not silently resolved +- User maintains control over musical intent -**Cons:** -- Silently drops one user's edit -- No way to recover lost intent +**Trade-off:** Requires conflict resolution UI, but musical notation benefits from explicit user control. -### Option 2: Hierarchical Cell Model +#### 3. Derived IDs for Memory Efficiency -Subdivisions create nested structure: +**Current:** `DerivedId` clones the UUID for simplicity (16 bytes per ID) -```typescript -export type Cell = Immutable<{ - id: CellId; - duration: number; +**Future optimization:** Intern UUIDs on the `Doc` and use `Arc` in `DerivedId` +- Each operation's UUID appears in ~10-100 derived IDs (grid, rows, cells) +- Interning saves 16 bytes × number of derived IDs per unique UUID +- Cleanup not needed: same UUIDs reused on every `realize()` - // If subdivided, contains child cells - children: Cell[] | null; +## Conflict Resolution: Branch-Based Approach - // For CRDT: track modifications - createdBy: string; - subdivideOps: Set; -}>; -``` +**Decision:** Use branch-based conflict detection as described in `branch-based-merge.md` -**Pros:** -- Preserves both users' intent -- Can show both subdivisions in UI - -**Cons:** -- Complex merge logic -- Unclear what "subdivide an already-subdivided cell" means -- May require user resolution anyway - -### Option 3: Conflict Detection + User Resolution (RECOMMENDED) - -Allow concurrent operations to apply, but detect conflicts: - -```typescript -function validateRowWidth(row: Row): ValidationResult { - const activeCells = row.cells.filter(c => !c.deleted); - const totalDuration = activeCells.reduce((sum, c) => sum + c.duration, 0); - const expected = getExpectedDuration(row); - - if (totalDuration !== expected) { - return { - valid: false, - conflictingOps: findConflictingOps(row), - affectedCells: findAffectedCells(row), - suggestedResolution: { - // Provide options to user - } - }; - } - - return { valid: true }; -} -``` +### How It Works + +1. **Track causality with vector clocks** + - Each operation includes a vector clock + - Determine if operations are concurrent or causally ordered -**Pros:** -- No silent data loss -- Users understand what happened -- Can choose best resolution per case +2. **Detect divergent branches** + ```rust + if !clock_a.happens_before(&clock_b) && !clock_b.happens_before(&clock_a) { + // Operations are concurrent - branches have diverged + } + ``` -**Cons:** -- Requires UI for conflict resolution -- Temporary invalid states exist -- More complex UX +3. **Reconstruct each branch's view** + - Apply operations in causal order for Branch A + - Apply operations in causal order for Branch B + - Show user both interpretations + +4. **Let user resolve conflicts** + - Smart detection: overlapping cell ranges, duration mismatches, etc. + - Offer automatic resolution when safe (non-overlapping edits) + - Require user choice when ambiguous + +### Why This Works Without Commutativity + +Traditional CRDTs require operations to commute (same result regardless of order). We don't: + +**Within a branch:** Operations are sequential, not concurrent +- Actor A's operations build on each other causally +- Applying them in order produces consistent state + +**Between branches:** Conflicts are explicit +- Actor A: subdivide cells 0-3 into triplets +- Actor B: subdivide cells 2-5 into quintuplets +- These overlap → conflict → show user both branches + +**Benefits:** +- Simpler operation semantics (no need to make everything commute) +- Preserves user intent from each branch +- Musical notation benefits from explicit control +- Can offer smart automatic resolution for non-conflicting edits + +## Implementation Status + +✅ **Completed:** +- Vector clock implementation for causality tracking +- Operation-based `Doc` structure with operation log +- `DerivedId` system for deterministic cell IDs +- `CreateGrid` operation with deterministic ID generation +- `ChangeSubdivisions` operation with range-based cell replacement +- Basic `realize()` to reconstruct `RealizedDoc` from operations +- Error handling for missing grids/rows/cells + +🚧 **In Progress:** +- Conflict detection using vector clocks +- Branch reconstruction and comparison +- Smart conflict identification (overlapping ranges, duration mismatches) + +📋 **Next Steps:** +- Add duration tracking to cells +- Implement validation that subdivisions preserve row duration +- Detect conflicting concurrent operations +- Build per-branch state views +- Design conflict resolution UI/API +- Add cell property operations (pitch, velocity, etc.) +- Snapshot and compaction strategy +- Consider Hybrid Logical Clocks for wall-clock ordering ## Open Questions -1. **Which conflict resolution method to use?** - - Pure LWW? - - Conflict detection + user resolution? - - Hybrid approach? - -2. **Fractional indexing implementation:** - - Use existing library or custom? - - How to handle position exhaustion? +1. **Where does validation belong?** + - In `Op::apply()` to fail fast? + - Separate validation pass after `realize()`? + - During conflict detection only? -3. **Operation validation:** - - Client-side only or server-side too? - - How to handle invalid operations from old clients? +2. **Cell property model:** + - Store all properties in `Cell` struct? + - Separate operations for each property type? + - How to handle property conflicts? -4. **Snapshot strategy:** +3. **Snapshot strategy:** - When to create snapshots for performance? - How to compact operation history? + - Can we safely garbage collect old operations? -## Next Steps - -- [ ] Decide on conflict resolution strategy -- [ ] Implement basic Cell ID system -- [ ] Replace index-based operations with ID-based -- [ ] Add operation IDs and timestamps (HLC) -- [ ] Implement fractional positioning -- [ ] Test commutativity with randomized operation order -- [ ] Design conflict UI (if using Option 3) +4. **Server-side validation:** + - Trust clients or validate operations server-side? + - How to handle invalid operations from old clients? ## References @@ -266,5 +245,7 @@ function validateRowWidth(row: Row): ValidationResult { ## Related Code -- Current implementation: `src/doc/index.ts` -- Grid realization logic: `src/doc/index.ts:87-125` +- **Rust CRDT implementation:** `/crdt/src/lib.rs` +- **Vector clock:** `/crdt/src/vector_clock.rs` +- **Branch-based merge design:** `./branch-based-merge.md` +- **Fractional indexing analysis:** `./fractional-indexing.md` (not used in current implementation) -- cgit v1.2.3