summaryrefslogtreecommitdiff
path: root/docs/design/crdt-design.md
diff options
context:
space:
mode:
Diffstat (limited to 'docs/design/crdt-design.md')
-rw-r--r--docs/design/crdt-design.md251
1 files changed, 0 insertions, 251 deletions
diff --git a/docs/design/crdt-design.md b/docs/design/crdt-design.md
deleted file mode 100644
index b98c5ed..0000000
--- a/docs/design/crdt-design.md
+++ /dev/null
@@ -1,251 +0,0 @@
-# CRDT Design for Music Notation Grid
-
-## Overview
-
-This document captures the design considerations for implementing a CRDT-based state management system for a music notation grid that supports rhythmic subdivisions (e.g., triplets, quintuplets).
-
-**Last Updated:** 2025-11-14
-
-## Domain Context
-
-- **Grid Structure:** Rows of cells representing rhythmic divisions
-- **Base Unit:** Cells represent musical durations (e.g., quarter notes)
-- **Subdivisions:** Replacing N cells with M cells of different duration (e.g., 4 eighth notes → 3 eighth note triplets)
-- **Critical Constraint:** Total rhythmic duration of each row must remain constant (width preservation)
-
-## Why Not Use Automerge/JSON CRDT?
-
-### Problems with Generic JSON CRDTs
-
-1. **Invariant Enforcement:** Automerge doesn't know about width constraints
- - Concurrent edits might violate total duration
- - No way to "reject" a CRDT merge after the fact
- - Can't undo without breaking causality
-
-2. **Wrong Abstraction Level:**
- - Automerge provides list/map primitives
- - We need rhythmic subdivision semantics
- - Like using text CRDT for spreadsheet formulas
-
-3. **Validation After Merge:**
- ```typescript
- const doc = Automerge.merge(docA, docB);
- if (!isValidWidth(doc.grid.rows[0])) {
- // Now what? Can't reject, can't undo easily
- }
- ```
-
-## Implemented Approach: Operation-Based CRDT with Branch-Based Conflict Resolution
-
-### Core Principles
-
-1. **Cell IDs instead of indices** - stable references across replicas using deterministic `DerivedId`
-2. **Subdivide = replace range** - clear semantics that preserve duration
-3. **Branch-based conflict detection** - conflicts detected via vector clocks, resolved explicitly by users
-4. **Deterministic realization** - each branch produces consistent state from its operation sequence
-5. **No fractional indexing in operations** - positions computed during `realize()`, not stored
-
-### Type Definitions (Rust Implementation)
-
-```rust
-// Deterministic cell ID derived from operation UUID
-pub struct DerivedId {
- id: Uuid, // Operation UUID (cloned, can be interned later)
- tag: &'static str, // "grid", "row", or "cell"
- index: usize, // Index within the operation's scope
-}
-
-pub struct Cell {
- id: DerivedId,
- // Duration and other properties to be added
-}
-
-pub struct Row {
- id: DerivedId,
- cells: Vec<Cell>,
-}
-
-pub struct Grid {
- id: DerivedId,
- rows: Vec<Row>,
-}
-
-pub enum OpPayload {
- CreateGrid {
- rows: usize,
- base_cells_per_row: usize,
- // Cell IDs are derived deterministically: op.id.derive_id("cell", idx)
- },
-
- ChangeSubdivisions {
- grid_id: DerivedId,
- row_id: DerivedId,
- start_cell_id: DerivedId, // First cell in range
- end_cell_id: DerivedId, // Last cell in range (inclusive)
- subdivisions: usize, // Number of new cells
- // New cell IDs derived as: op.id.derive_id("cell", idx)
- },
-}
-
-pub struct Op {
- id: Uuid, // Unique operation ID
- clock: VectorClock, // For causality tracking
- payload: OpPayload,
-}
-```
-
-### Key Design Decisions
-
-#### 1. Deterministic Cell IDs (No Fractional Indexing)
-
-**Decision:** Cell IDs are derived deterministically from the operation ID and index, not stored explicitly.
-
-```rust
-// Creating a grid operation
-op_id = Uuid::now_v7()
-cells = (0..16).map(|i| op_id.derive_id("cell", i))
-// Produces: [uuid:cell:0, uuid:cell:1, ..., uuid:cell:15]
-```
-
-**Why no fractional indexing in operations?**
-- With branch-based conflict resolution, operations within a branch are sequential, not concurrent
-- Each branch produces a deterministic cell order from its operation sequence
-- Fractional positions would be regenerated on every `realize()` anyway
-- Conflicts between branches are detected and resolved explicitly, not merged automatically
-- Simpler operation structure and smaller serialization size
-
-**Positions are computed during realization** if needed for UI, not stored in operations.
-
-#### 2. Branch-Based Conflict Resolution (Not Tombstones)
-
-**Decision:** Currently using `Vec::splice` to replace cell ranges. Future conflict detection will use vector clocks.
-
-Instead of making operations commutative with tombstones, we:
-1. Apply operations in causal order within each branch
-2. Detect divergent branches using vector clocks
-3. Show conflicts to users when branches have incompatible operations
-4. Let users choose resolution strategy
-
-**Benefits:**
-- Simpler operation semantics (just replace this range)
-- No accumulated tombstones taking up memory
-- Conflicts are explicit, not silently resolved
-- User maintains control over musical intent
-
-**Trade-off:** Requires conflict resolution UI, but musical notation benefits from explicit user control.
-
-#### 3. Derived IDs for Memory Efficiency
-
-**Current:** `DerivedId` clones the UUID for simplicity (16 bytes per ID)
-
-**Future optimization:** Intern UUIDs on the `Doc` and use `Arc<Uuid>` in `DerivedId`
-- Each operation's UUID appears in ~10-100 derived IDs (grid, rows, cells)
-- Interning saves 16 bytes × number of derived IDs per unique UUID
-- Cleanup not needed: same UUIDs reused on every `realize()`
-
-## Conflict Resolution: Branch-Based Approach
-
-**Decision:** Use branch-based conflict detection as described in `branch-based-merge.md`
-
-### How It Works
-
-1. **Track causality with vector clocks**
- - Each operation includes a vector clock
- - Determine if operations are concurrent or causally ordered
-
-2. **Detect divergent branches**
- ```rust
- if !clock_a.happens_before(&clock_b) && !clock_b.happens_before(&clock_a) {
- // Operations are concurrent - branches have diverged
- }
- ```
-
-3. **Reconstruct each branch's view**
- - Apply operations in causal order for Branch A
- - Apply operations in causal order for Branch B
- - Show user both interpretations
-
-4. **Let user resolve conflicts**
- - Smart detection: overlapping cell ranges, duration mismatches, etc.
- - Offer automatic resolution when safe (non-overlapping edits)
- - Require user choice when ambiguous
-
-### Why This Works Without Commutativity
-
-Traditional CRDTs require operations to commute (same result regardless of order). We don't:
-
-**Within a branch:** Operations are sequential, not concurrent
-- Actor A's operations build on each other causally
-- Applying them in order produces consistent state
-
-**Between branches:** Conflicts are explicit
-- Actor A: subdivide cells 0-3 into triplets
-- Actor B: subdivide cells 2-5 into quintuplets
-- These overlap → conflict → show user both branches
-
-**Benefits:**
-- Simpler operation semantics (no need to make everything commute)
-- Preserves user intent from each branch
-- Musical notation benefits from explicit control
-- Can offer smart automatic resolution for non-conflicting edits
-
-## Implementation Status
-
-✅ **Completed:**
-- Vector clock implementation for causality tracking
-- Operation-based `Doc` structure with operation log
-- `DerivedId` system for deterministic cell IDs
-- `CreateGrid` operation with deterministic ID generation
-- `ChangeSubdivisions` operation with range-based cell replacement
-- Basic `realize()` to reconstruct `RealizedDoc` from operations
-- Error handling for missing grids/rows/cells
-
-🚧 **In Progress:**
-- Conflict detection using vector clocks
-- Branch reconstruction and comparison
-- Smart conflict identification (overlapping ranges, duration mismatches)
-
-📋 **Next Steps:**
-- Add duration tracking to cells
-- Implement validation that subdivisions preserve row duration
-- Detect conflicting concurrent operations
-- Build per-branch state views
-- Design conflict resolution UI/API
-- Add cell property operations (pitch, velocity, etc.)
-- Snapshot and compaction strategy
-- Consider Hybrid Logical Clocks for wall-clock ordering
-
-## Open Questions
-
-1. **Where does validation belong?**
- - In `Op::apply()` to fail fast?
- - Separate validation pass after `realize()`?
- - During conflict detection only?
-
-2. **Cell property model:**
- - Store all properties in `Cell` struct?
- - Separate operations for each property type?
- - How to handle property conflicts?
-
-3. **Snapshot strategy:**
- - When to create snapshots for performance?
- - How to compact operation history?
- - Can we safely garbage collect old operations?
-
-4. **Server-side validation:**
- - Trust clients or validate operations server-side?
- - How to handle invalid operations from old clients?
-
-## References
-
-- [Automerge Documentation](https://automerge.org/)
-- [Yjs CRDT](https://docs.yjs.dev/)
-- [Figma's Fractional Indexing](https://www.figma.com/blog/realtime-editing-of-ordered-sequences/)
-- [Hybrid Logical Clocks Paper](https://jaredforsyth.com/posts/hybrid-logical-clocks/)
-
-## Related Code
-
-- **Rust CRDT implementation:** `/crdt/src/lib.rs`
-- **Vector clock:** `/crdt/src/vector_clock.rs`
-- **Branch-based merge design:** `./branch-based-merge.md`
-- **Fractional indexing analysis:** `./fractional-indexing.md` (not used in current implementation)