From eb96b8eba0faf07adc5be4463759ec8b64ddc0e0 Mon Sep 17 00:00:00 2001 From: Josh Kingsley Date: Sun, 23 Nov 2025 17:27:34 +0200 Subject: chore: remove docs --- docs/design/README.md | 76 ----- docs/design/branch-based-merge.md | 594 ------------------------------------- docs/design/crdt-design.md | 251 ---------------- docs/design/fractional-indexing.md | 396 ------------------------- docs/design/musical-model.md | 321 -------------------- 5 files changed, 1638 deletions(-) delete mode 100644 docs/design/README.md delete mode 100644 docs/design/branch-based-merge.md delete mode 100644 docs/design/crdt-design.md delete mode 100644 docs/design/fractional-indexing.md delete mode 100644 docs/design/musical-model.md (limited to 'docs') diff --git a/docs/design/README.md b/docs/design/README.md deleted file mode 100644 index 7486251..0000000 --- a/docs/design/README.md +++ /dev/null @@ -1,76 +0,0 @@ -# CRDT Design Documentation - -This directory contains the design documentation for Notive's collaborative music notation CRDT implementation. - -## Documents - -### [crdt-design.md](./crdt-design.md) -The foundational design document covering: -- Why we need a custom operation-based CRDT (not Automerge/Yjs) -- Core type definitions (Cell, Grid, Operations) -- Domain constraints (duration preservation, subdivisions) -- Initial conflict resolution options -- Related to existing TypeScript implementation - -### [fractional-indexing.md](./fractional-indexing.md) -Deep dive into stable position identifiers: -- Why array indices don't work for concurrent edits -- How fractional indexing provides stable cell references -- Implementation details and examples -- Handling position space exhaustion -- Production considerations - -### [branch-based-merge.md](./branch-based-merge.md) -Advanced conflict detection and resolution strategy: -- Detecting concurrent operations with vector clocks -- Grouping operations into causal branches -- Reconstructing per-actor views of state -- Smart conflict detection (overlapping subdivisions, lost edits, duration mismatches) -- User experience for merge conflicts -- Automatic resolution where safe, user control where ambiguous - -## Reading Order - -For someone new to the project: -1. Start with **crdt-design.md** for context and requirements -2. Read **fractional-indexing.md** to understand cell positioning -3. Review **branch-based-merge.md** for the conflict resolution strategy - -## Implementation Status - -The CRDT is being implemented as a Rust crate in `/crdt`: -- ✅ Vector clock implementation -- ✅ Basic Doc structure with operation log -- ✅ Core domain types (Cell, Row, Grid, DerivedId) -- ✅ CreateGrid operation -- ✅ ChangeSubdivisions operation -- ❌ Fractional indexing (decided not to use in operations) -- ⏳ Conflict detection -- ⏳ Branch-based merge - -## Design Principles - -1. **Causal Consistency** - Operations respect causality via vector clocks -2. **Deterministic Realization** - Each branch produces consistent state from its operation sequence -3. **No Silent Data Loss** - Conflicts are explicit, not silently resolved -4. **Domain Invariants** - Preserve musical constraints (e.g., row duration) -5. **User Control** - Explicit conflict resolution for ambiguous concurrent edits -6. **Simplicity Over Commutativity** - Branch-based resolution instead of forcing all operations to commute - -## Key Architectural Decisions - -- **Operation-based CRDT** (not state-based) for semantic operations -- **Vector clocks** for causality tracking (future: consider Hybrid Logical Clocks) -- **Deterministic IDs** using `DerivedId` (no fractional indexing in operations) -- **Range-based subdivision** instead of tombstones (simpler semantics) -- **Branch-based conflict resolution** rather than automatic commutativity -- **User-driven resolution** for ambiguous concurrent edits - -## Future Work - -- Hybrid Logical Clocks for wall-clock ordering -- Snapshot and compaction strategy -- Server-side operation validation -- Operational transformation for some conflicts -- Multi-grid support and cross-grid operations -- Undo/redo with CRDT semantics diff --git a/docs/design/branch-based-merge.md b/docs/design/branch-based-merge.md deleted file mode 100644 index 85dc58d..0000000 --- a/docs/design/branch-based-merge.md +++ /dev/null @@ -1,594 +0,0 @@ -# Branch-Based Merge with Conflict Detection - -**Last Updated:** 2025-11-13 - -## Overview - -This document describes the approach for handling concurrent operations in the CRDT, using vector clocks to detect causal branches and implementing smart merge logic that automatically resolves safe conflicts while flagging ambiguous ones for user resolution. - -## The Core Challenge - -When two users make concurrent subdivisions to a music notation grid, we need to: -1. Apply both operations (CRDT convergence requirement) -2. Detect when operations conflict (overlapping edits, duration violations) -3. Present conflicts to users in an understandable way -4. Preserve both users' intent without silent data loss - -## Detecting Concurrent Operations - -Vector clocks allow us to determine if two operations are concurrent (neither causally depends on the other): - -```rust -impl VectorClock { - pub fn is_concurrent_with(&self, other: &VectorClock) -> bool { - // Two clocks are concurrent if neither is ≤ the other - // This means each has seen events the other hasn't - match self.partial_cmp(other) { - None => true, // Incomparable = concurrent - Some(Ordering::Equal) => false, - _ => false, - } - } -} -``` - -**Example:** -``` -Alice's clock: {alice: 5, bob: 2} -Bob's clock: {alice: 3, bob: 6} - -Alice has seen 5 of her ops, 2 of Bob's -Bob has seen 3 of Alice's ops, 6 of his own - -Neither clock is ≤ the other → Concurrent! -``` - -## Grouping Operations into Causal Branches - -Operations form a directed acyclic graph (DAG) based on causality: - -```rust -pub struct Branch { - pub tip_clock: VectorClock, - pub ops: Vec, -} - -impl Doc { - /// Detect causal branches in the operation history - pub fn detect_branches(&self) -> Vec { - let mut branches: Vec = vec![]; - - for op in &self.ops { - // Find branches this op extends (causally after) - let extends: Vec = branches.iter() - .enumerate() - .filter(|(_, b)| op.clock > b.tip_clock) - .map(|(i, _)| i) - .collect(); - - match extends.len() { - 0 => { - // New concurrent branch - branches.push(Branch { - tip_clock: op.clock.clone(), - ops: vec![op.clone()], - }); - } - 1 => { - // Extends single branch (linear history) - let branch = &mut branches[extends[0]]; - branch.ops.push(op.clone()); - branch.tip_clock = op.clock.clone(); - } - _ => { - // Merges multiple branches - // This op can see all previous concurrent branches - // Could create a new "merge node" in the DAG - } - } - } - - branches - } -} -``` - -## Reconstructing Actor Views - -Each actor has a causal view of the state at their last operation: - -```rust -impl Doc { - /// Realize the state as a specific actor saw it - pub fn realize_actor_view(&self, actor_id: &Uuid) -> RealizedDoc { - let last_op = self.ops.iter() - .filter(|op| op.clock.get(actor_id) > 0) - .last(); - - if let Some(last_op) = last_op { - let mut realized = RealizedDoc::default(); - - // Apply only operations in this actor's causal past - for op in &self.ops { - if op.clock <= last_op.clock { - op.apply(&mut realized); - } - } - - realized - } else { - RealizedDoc::default() - } - } -} -``` - -**Why this works:** Vector clocks define a partial order. If `op.clock <= actor_last_clock`, then the actor had seen (or could have seen) that operation when they made their edit. - -## Smart Conflict Detection - -Not all concurrent operations conflict! We detect specific problematic patterns: - -### 1. Overlapping Subdivisions - -```rust -fn detect_overlapping_subdivisions(&self) -> Vec { - let mut conflicts = vec![]; - - for (i, op_a) in self.ops.iter().enumerate() { - if let OpPayload::Subdivide { deleted_cell_ids: cells_a, .. } = &op_a.payload { - for op_b in &self.ops[i+1..] { - if let OpPayload::Subdivide { deleted_cell_ids: cells_b, .. } = &op_b.payload { - // Must be concurrent - if !op_a.clock.is_concurrent_with(&op_b.clock) { - continue; - } - - // Check for overlapping target cells - let overlap: Vec<_> = cells_a.iter() - .filter(|c| cells_b.contains(c)) - .cloned() - .collect(); - - if !overlap.is_empty() { - conflicts.push(ConflictDetail { - kind: ConflictKind::OverlappingSubdivision, - ops: vec![op_a.id, op_b.id], - affected_cells: overlap, - }); - } - } - } - } - } - - conflicts -} -``` - -**Example:** -``` -Initial: [A, B, C, D] - -Alice subdivides [A, B] → concurrent → Both apply -Bob subdivides [B, C] → concurrent → Duration mismatch! - -Cell B is deleted by both operations -Result has extra duration from both new cell sets -``` - -### 2. Lost Edits (Edit-Delete Conflict) - -```rust -fn detect_lost_edits(&self) -> Vec { - let mut conflicts = vec![]; - - for (i, op_a) in self.ops.iter().enumerate() { - if let OpPayload::SetCellValue { cell_id, .. } = op_a.payload { - for op_b in &self.ops[i+1..] { - if let OpPayload::Subdivide { deleted_cell_ids, .. } = &op_b.payload { - if op_a.clock.is_concurrent_with(&op_b.clock) - && deleted_cell_ids.contains(&cell_id) { - conflicts.push(ConflictDetail { - kind: ConflictKind::CellValueLostBySubdivision, - ops: vec![op_a.id, op_b.id], - affected_cells: vec![cell_id], - }); - } - } - } - } - } - - conflicts -} -``` - -**Example:** -``` -Alice sets cell B value to "C#" → concurrent -Bob subdivides [B, C] → concurrent - -Alice's edit is "lost" because B is deleted -User should be warned about this -``` - -### 3. Duration Mismatches - -```rust -fn detect_duration_mismatches(&self, realized: &RealizedDoc) -> Vec { - let mut conflicts = vec![]; - - for row in &realized.grid.rows { - let active_cells: Vec<_> = row.cells.iter() - .filter(|c| !c.deleted) - .collect(); - - let actual: f64 = active_cells.iter() - .map(|c| c.duration) - .sum(); - - let expected = row.expected_duration; - - if (actual - expected).abs() > 0.001 { - // Find which operations affected this row - let affecting_ops = self.find_ops_affecting_row(row.id); - - conflicts.push(ConflictDetail { - kind: ConflictKind::DurationMismatch { - expected, - actual, - }, - ops: affecting_ops, - affected_cells: active_cells.iter().map(|c| c.id).collect(), - }); - } - } - - conflicts -} -``` - -## The Smart Merge Algorithm - -```rust -pub enum MergeResult { - Clean(RealizedDoc), - Conflict { - branches: Vec<(Uuid, RealizedDoc)>, // Actor views - conflicts: Vec, - }, -} - -impl Doc { - pub fn smart_merge(&self) -> MergeResult { - // Apply all operations (CRDT convergence) - let mut merged = RealizedDoc::default(); - for op in &self.ops { - op.apply(&mut merged); - } - - // Detect conflicts - let mut conflicts = vec![]; - conflicts.extend(self.detect_overlapping_subdivisions()); - conflicts.extend(self.detect_lost_edits()); - conflicts.extend(self.detect_duration_mismatches(&merged)); - - if conflicts.is_empty() { - // Clean merge - all concurrent ops are compatible - MergeResult::Clean(merged) - } else { - // Reconstruct each actor's view for the merge UI - let actors = self.get_divergent_actors(); - let actor_views = actors.into_iter() - .map(|actor_id| (actor_id, self.realize_actor_view(&actor_id))) - .collect(); - - MergeResult::Conflict { - branches: actor_views, - conflicts, - } - } - } -} -``` - -## Implementation Strategy - -### Realization with Conflict Detection - -When realizing a document: - -1. **Apply all operations** - Every op in the log applies, creating cells and tombstones -2. **Detect conflicts** - Find where concurrent ops interfere (overlapping deletions, duration mismatches) -3. **Pick display winner** - Use tie-breaker (op ID, or current viewer's changes) for showing a working state -4. **Mark conflicted regions** - Track which cells/ranges are involved in conflicts -5. **Enable resolution** - User can drill into conflicts to see actor views and choose final version - -```rust -pub struct RealizedDoc { - grids: Vec, - conflicts: Vec, // Detected conflicts -} - -pub struct Conflict { - kind: ConflictKind, - affected_cells: Vec, - conflicting_ops: Vec, -} - -impl Doc { - pub fn realize(&self) -> RealizedDoc { - let mut realized = RealizedDoc::default(); - let mut conflicts = vec![]; - - // Apply all ops, tracking conflicts - for op in &self.ops { - if let Err(conflict) = op.apply(&mut realized) { - conflicts.push(conflict); - // Apply arbitrary winner for display - op.apply_with_tiebreaker(&mut realized); - } - } - - realized.conflicts = conflicts; - realized - } -} -``` - -### Arbitrary Winner Selection - -When concurrent ops target the same cells, pick a winner for display purposes: - -**Strategy 1: Deterministic (Op ID)** -```rust -// Lexicographically larger op ID wins -if op1.id > op2.id { - // Apply op1's changes -} -``` - -**Strategy 2: Viewer-based** -```rust -// Current viewer's changes win by default -if op.created_by_actor() == current_viewer { - // Apply this viewer's changes -} else { - // Use fallback (op ID) -} -``` - -This gives each user a familiar view (their changes visible) while still highlighting conflicts. - -## User Experience - -### Auto-Merge (No Conflict) - -``` -Alice subdivides [A, B] → Triplets -Bob subdivides [C, D] → Quintuplets - -No overlap, no conflict -Result: [Triplet1, Triplet2, Triplet3, Quintuplet1, ..., Quintuplet5] -Duration preserved ✓ -``` - -### Conflict Detection UI (Updated) - -**Initial View: Conflicted Region Highlighted** - -``` -Row 1: (Viewing as Alice) -┌───┬───┬───┬───┬───┬───┬───┬───┐ -│ │ │⚠️T1│⚠️T2│⚠️T3│ │ │ │ ⚠️ Conflict detected -└───┴───┴───┴───┴───┴───┴───┴───┘ - └─────────┘ - Click to resolve - -// Alice's changes are shown by default (viewer-based winner) -// Conflicted cells are highlighted with a warning indicator -``` - -**Resolution View: Click to See Actor Versions** - -``` -┌──────────────────────────────────────────────────────────┐ -│ Conflict Resolution: Overlapping Subdivisions │ -├──────────────────────────────────────────────────────────┤ -│ │ -│ Alice and Bob edited cells [B, C, D] concurrently │ -│ │ -│ 👤 Alice's version: │ -│ ┌───┬───┬───┬───┐ │ -│ │ A │T1 │T2 │T3 │ (3 triplets from [B,C,D]) │ -│ └───┴───┴───┴───┘ Duration: 1.0 ✓ │ -│ │ -│ 👤 Bob's version: │ -│ ┌───┬───┬───┬───┬───┬───┐ │ -│ │ A │Q1 │Q2 │Q3 │Q4 │Q5 │ (5 quintuplets from [B,C,D]) │ -│ └───┴───┴───┴───┴───┴───┘ Duration: 1.0 ✓ │ -│ │ -│ [✓ Keep Alice's version ] [ Keep Bob's version ] │ -│ [ Manually merge ] │ -└──────────────────────────────────────────────────────────┘ -``` - -**Key UX Properties:** - -1. **Always shows working state** - Uses arbitrary winner (viewer's changes by default) -2. **Conflicts are visible** - Highlighted regions indicate resolution needed -3. **Non-intrusive** - User can keep working, resolve conflicts when ready -4. **Context preservation** - Actor views show what each person saw when they edited - -## Resolution Strategies - -### 1. Keep One Version -Create a new operation that: -- Undoes the unwanted changes (marks those cells as deleted) -- Re-applies one actor's version as a new operation -- Preserves causality (new operation depends on both branches) - -### 2. Keep Both as Separate Grids -- Fork the grid into two separate grids -- Each preserves one actor's version -- Allows manual copying between grids later - -### 3. Manual Resolution -- UI allows editing the merged (invalid) state -- Create operations to fix duration mismatches -- New operations causally depend on all previous ops - -## Key Design Properties - -1. **No Silent Data Loss** - Both operations always apply; conflicts are explicit -2. **Automatic Safe Merges** - Non-overlapping concurrent edits merge cleanly -3. **Deterministic Convergence** - All replicas converge to same state (including conflicts) -4. **User Control** - Ambiguous cases are presented to users, not arbitrarily resolved -5. **Causal Consistency** - Vector clocks ensure we can reconstruct any actor's view - -## Implementation Requirements - -### Data Structures - -```rust -// Cell with tombstone tracking -pub struct Cell { - id: DerivedId, - duration: Ratio, - created_by: Uuid, // Op that created this cell - deleted_by: Option, // Op that deleted this cell (if any) -} - -// Realized state includes conflicts -pub struct RealizedDoc { - grids: Vec, - conflicts: Vec, - deleted_cells: HashMap, // Quick lookup: cell_id -> deleting_op_id -} - -// Conflict representation -pub struct Conflict { - kind: ConflictKind, - affected_cells: Vec, - conflicting_ops: Vec, -} - -pub enum ConflictKind { - OverlappingSubdivision { op1: Uuid, op2: Uuid }, - DurationMismatch { expected: Ratio, actual: Ratio }, - // ... other conflict types -} -``` - -### Realization Algorithm - -```rust -impl Doc { - pub fn realize(&self) -> RealizedDoc { - let mut realized = RealizedDoc::default(); - - // Step 1: Apply all ops (with tombstones) - for op in &self.ops { - op.apply_with_tombstones(&mut realized); - } - - // Step 2: Detect conflicts - realized.conflicts = self.detect_conflicts(&realized); - - // Step 3: Apply tie-breaker for display - // (Viewer-based or deterministic) - self.apply_conflict_winners(&mut realized); - - realized - } - - fn detect_conflicts(&self, realized: &RealizedDoc) -> Vec { - let mut conflicts = vec![]; - - // Find overlapping concurrent subdivisions - for (i, op_a) in self.ops.iter().enumerate() { - for op_b in &self.ops[i+1..] { - if op_a.clock.is_concurrent_with(&op_b.clock) { - if let Some(conflict) = self.check_subdivision_conflict(op_a, op_b) { - conflicts.push(conflict); - } - } - } - } - - // Check duration invariants - for grid in &realized.grids { - for row in &grid.rows { - if let Some(conflict) = self.check_duration_conflict(row) { - conflicts.push(conflict); - } - } - } - - conflicts - } -} -``` - -### Actor View Reconstruction - -For conflict resolution UI: - -```rust -impl Doc { - /// Realize the document as a specific actor saw it - pub fn realize_actor_view(&self, actor_id: &Uuid) -> RealizedDoc { - let last_actor_op = self.ops.iter() - .filter(|op| op.get_actor() == actor_id) - .last(); - - if let Some(last_op) = last_actor_op { - let mut realized = RealizedDoc::default(); - - // Only apply ops in this actor's causal past - for op in &self.ops { - if op.clock <= last_op.clock { - op.apply_with_tombstones(&mut realized); - } - } - - realized - } else { - RealizedDoc::default() - } - } - - /// Get actor views for all participants in a conflict - pub fn get_conflict_actor_views(&self, conflict: &Conflict) -> HashMap { - conflict.conflicting_ops.iter() - .map(|op_id| { - let op = self.ops.iter().find(|o| o.id == *op_id).unwrap(); - let actor = op.get_actor(); - (actor, self.realize_actor_view(&actor)) - }) - .collect() - } -} -``` - -## Implementation Notes - -- **Store all operations** - Even if they create conflicts, preserve in log -- **Tombstones enable cascading ops** - Later ops can reference cells created by conflicting ops -- **Conflicts tracked separately** - Not part of cell structure, computed during realize() -- **Resolution creates new ops** - "Choose Alice's version" becomes part of history -- **Viewer-based display** - Each user sees their changes by default, conflicts highlighted -- **Deterministic fallback** - When no viewer context, use op ID for tie-breaking - -## Future Enhancements - -- **Operational Transformation** - Automatically adjust positions in some cases -- **Three-way Merge** - Show common ancestor state in conflict UI -- **Conflict Clustering** - Group related conflicts (e.g., same row) -- **Suggested Resolutions** - AI/heuristic-based merge suggestions - -## References - -- See `crdt-design.md` for overall CRDT architecture -- See `fractional-indexing.md` for position semantics diff --git a/docs/design/crdt-design.md b/docs/design/crdt-design.md deleted file mode 100644 index b98c5ed..0000000 --- a/docs/design/crdt-design.md +++ /dev/null @@ -1,251 +0,0 @@ -# CRDT Design for Music Notation Grid - -## Overview - -This document captures the design considerations for implementing a CRDT-based state management system for a music notation grid that supports rhythmic subdivisions (e.g., triplets, quintuplets). - -**Last Updated:** 2025-11-14 - -## Domain Context - -- **Grid Structure:** Rows of cells representing rhythmic divisions -- **Base Unit:** Cells represent musical durations (e.g., quarter notes) -- **Subdivisions:** Replacing N cells with M cells of different duration (e.g., 4 eighth notes → 3 eighth note triplets) -- **Critical Constraint:** Total rhythmic duration of each row must remain constant (width preservation) - -## Why Not Use Automerge/JSON CRDT? - -### Problems with Generic JSON CRDTs - -1. **Invariant Enforcement:** Automerge doesn't know about width constraints - - Concurrent edits might violate total duration - - No way to "reject" a CRDT merge after the fact - - Can't undo without breaking causality - -2. **Wrong Abstraction Level:** - - Automerge provides list/map primitives - - We need rhythmic subdivision semantics - - Like using text CRDT for spreadsheet formulas - -3. **Validation After Merge:** - ```typescript - const doc = Automerge.merge(docA, docB); - if (!isValidWidth(doc.grid.rows[0])) { - // Now what? Can't reject, can't undo easily - } - ``` - -## Implemented Approach: Operation-Based CRDT with Branch-Based Conflict Resolution - -### Core Principles - -1. **Cell IDs instead of indices** - stable references across replicas using deterministic `DerivedId` -2. **Subdivide = replace range** - clear semantics that preserve duration -3. **Branch-based conflict detection** - conflicts detected via vector clocks, resolved explicitly by users -4. **Deterministic realization** - each branch produces consistent state from its operation sequence -5. **No fractional indexing in operations** - positions computed during `realize()`, not stored - -### Type Definitions (Rust Implementation) - -```rust -// Deterministic cell ID derived from operation UUID -pub struct DerivedId { - id: Uuid, // Operation UUID (cloned, can be interned later) - tag: &'static str, // "grid", "row", or "cell" - index: usize, // Index within the operation's scope -} - -pub struct Cell { - id: DerivedId, - // Duration and other properties to be added -} - -pub struct Row { - id: DerivedId, - cells: Vec, -} - -pub struct Grid { - id: DerivedId, - rows: Vec, -} - -pub enum OpPayload { - CreateGrid { - rows: usize, - base_cells_per_row: usize, - // Cell IDs are derived deterministically: op.id.derive_id("cell", idx) - }, - - ChangeSubdivisions { - grid_id: DerivedId, - row_id: DerivedId, - start_cell_id: DerivedId, // First cell in range - end_cell_id: DerivedId, // Last cell in range (inclusive) - subdivisions: usize, // Number of new cells - // New cell IDs derived as: op.id.derive_id("cell", idx) - }, -} - -pub struct Op { - id: Uuid, // Unique operation ID - clock: VectorClock, // For causality tracking - payload: OpPayload, -} -``` - -### Key Design Decisions - -#### 1. Deterministic Cell IDs (No Fractional Indexing) - -**Decision:** Cell IDs are derived deterministically from the operation ID and index, not stored explicitly. - -```rust -// Creating a grid operation -op_id = Uuid::now_v7() -cells = (0..16).map(|i| op_id.derive_id("cell", i)) -// Produces: [uuid:cell:0, uuid:cell:1, ..., uuid:cell:15] -``` - -**Why no fractional indexing in operations?** -- With branch-based conflict resolution, operations within a branch are sequential, not concurrent -- Each branch produces a deterministic cell order from its operation sequence -- Fractional positions would be regenerated on every `realize()` anyway -- Conflicts between branches are detected and resolved explicitly, not merged automatically -- Simpler operation structure and smaller serialization size - -**Positions are computed during realization** if needed for UI, not stored in operations. - -#### 2. Branch-Based Conflict Resolution (Not Tombstones) - -**Decision:** Currently using `Vec::splice` to replace cell ranges. Future conflict detection will use vector clocks. - -Instead of making operations commutative with tombstones, we: -1. Apply operations in causal order within each branch -2. Detect divergent branches using vector clocks -3. Show conflicts to users when branches have incompatible operations -4. Let users choose resolution strategy - -**Benefits:** -- Simpler operation semantics (just replace this range) -- No accumulated tombstones taking up memory -- Conflicts are explicit, not silently resolved -- User maintains control over musical intent - -**Trade-off:** Requires conflict resolution UI, but musical notation benefits from explicit user control. - -#### 3. Derived IDs for Memory Efficiency - -**Current:** `DerivedId` clones the UUID for simplicity (16 bytes per ID) - -**Future optimization:** Intern UUIDs on the `Doc` and use `Arc` in `DerivedId` -- Each operation's UUID appears in ~10-100 derived IDs (grid, rows, cells) -- Interning saves 16 bytes × number of derived IDs per unique UUID -- Cleanup not needed: same UUIDs reused on every `realize()` - -## Conflict Resolution: Branch-Based Approach - -**Decision:** Use branch-based conflict detection as described in `branch-based-merge.md` - -### How It Works - -1. **Track causality with vector clocks** - - Each operation includes a vector clock - - Determine if operations are concurrent or causally ordered - -2. **Detect divergent branches** - ```rust - if !clock_a.happens_before(&clock_b) && !clock_b.happens_before(&clock_a) { - // Operations are concurrent - branches have diverged - } - ``` - -3. **Reconstruct each branch's view** - - Apply operations in causal order for Branch A - - Apply operations in causal order for Branch B - - Show user both interpretations - -4. **Let user resolve conflicts** - - Smart detection: overlapping cell ranges, duration mismatches, etc. - - Offer automatic resolution when safe (non-overlapping edits) - - Require user choice when ambiguous - -### Why This Works Without Commutativity - -Traditional CRDTs require operations to commute (same result regardless of order). We don't: - -**Within a branch:** Operations are sequential, not concurrent -- Actor A's operations build on each other causally -- Applying them in order produces consistent state - -**Between branches:** Conflicts are explicit -- Actor A: subdivide cells 0-3 into triplets -- Actor B: subdivide cells 2-5 into quintuplets -- These overlap → conflict → show user both branches - -**Benefits:** -- Simpler operation semantics (no need to make everything commute) -- Preserves user intent from each branch -- Musical notation benefits from explicit control -- Can offer smart automatic resolution for non-conflicting edits - -## Implementation Status - -✅ **Completed:** -- Vector clock implementation for causality tracking -- Operation-based `Doc` structure with operation log -- `DerivedId` system for deterministic cell IDs -- `CreateGrid` operation with deterministic ID generation -- `ChangeSubdivisions` operation with range-based cell replacement -- Basic `realize()` to reconstruct `RealizedDoc` from operations -- Error handling for missing grids/rows/cells - -🚧 **In Progress:** -- Conflict detection using vector clocks -- Branch reconstruction and comparison -- Smart conflict identification (overlapping ranges, duration mismatches) - -📋 **Next Steps:** -- Add duration tracking to cells -- Implement validation that subdivisions preserve row duration -- Detect conflicting concurrent operations -- Build per-branch state views -- Design conflict resolution UI/API -- Add cell property operations (pitch, velocity, etc.) -- Snapshot and compaction strategy -- Consider Hybrid Logical Clocks for wall-clock ordering - -## Open Questions - -1. **Where does validation belong?** - - In `Op::apply()` to fail fast? - - Separate validation pass after `realize()`? - - During conflict detection only? - -2. **Cell property model:** - - Store all properties in `Cell` struct? - - Separate operations for each property type? - - How to handle property conflicts? - -3. **Snapshot strategy:** - - When to create snapshots for performance? - - How to compact operation history? - - Can we safely garbage collect old operations? - -4. **Server-side validation:** - - Trust clients or validate operations server-side? - - How to handle invalid operations from old clients? - -## References - -- [Automerge Documentation](https://automerge.org/) -- [Yjs CRDT](https://docs.yjs.dev/) -- [Figma's Fractional Indexing](https://www.figma.com/blog/realtime-editing-of-ordered-sequences/) -- [Hybrid Logical Clocks Paper](https://jaredforsyth.com/posts/hybrid-logical-clocks/) - -## Related Code - -- **Rust CRDT implementation:** `/crdt/src/lib.rs` -- **Vector clock:** `/crdt/src/vector_clock.rs` -- **Branch-based merge design:** `./branch-based-merge.md` -- **Fractional indexing analysis:** `./fractional-indexing.md` (not used in current implementation) diff --git a/docs/design/fractional-indexing.md b/docs/design/fractional-indexing.md deleted file mode 100644 index 287e719..0000000 --- a/docs/design/fractional-indexing.md +++ /dev/null @@ -1,396 +0,0 @@ -# Fractional Indexing for Stable Cell Positions - -**Last Updated:** 2025-11-13 - -## Overview - -Fractional indexing provides stable, immutable identifiers for cell positions in an ordered sequence, enabling concurrent insertions without the coordination problems of array indices. - -## The Problem with Array Indices - -Consider this scenario with traditional array indices: - -``` -Initial state: -[0: A, 1: B, 2: C, 3: D] - -Alice subdivides cells 1-2 (B, C) -Bob subdivides cells 2-3 (C, D) - -After Alice's operation: -[0: A, 1: T1, 2: T2, 3: T3, 4: D] -Now "cell 2" means T2 - -After Bob's operation (he thinks "cell 2" is C): -[0: A, 1: T1, 2: T2, 3: T3, 4: Q1, 5: Q2, 6: Q3, 7: D] -But C is gone! Bob's operation is invalid. -``` - -**The problem:** Indices are relative positions that change when the array is modified. Concurrent operations reference different cells by the same index. - -## Fractional Indexing Solution - -Instead of indices, each cell has a **position string** that never changes: - -``` -Initial state: -Cell A: position="a0" -Cell B: position="a1" -Cell C: position="a2" -Cell D: position="a3" - -Alice subdivides [B, C] (position="a1" and "a2") -Bob subdivides [C, D] (position="a2" and "a3") - -Alice creates: - T1: position="a1.2" (between a1 and a2) - T2: position="a1.5" - T3: position="a1.8" - -Bob creates: - Q1: position="a2.2" (between a2 and a3) - Q2: position="a2.5" - Q3: position="a2.8" - -Merged result (sorted by position): -a0 (A), a1 (B, deleted), a1.2 (T1), a1.5 (T2), a1.8 (T3), -a2 (C, deleted), a2.2 (Q1), a2.5 (Q2), a2.8 (Q3), a3 (D, deleted) -``` - -**Key insight:** Position strings are immutable. Operations reference cells by their stable position, not their current index in the array. - -## Position String Format - -Positions are lexicographically sortable strings: - -``` -"a0" < "a0.5" < "a1" < "a1.25" < "a2" -``` - -### Base Structure - -``` -[prefix][integer].[fractional] -``` - -- **Prefix:** Letter(s) for global ordering (e.g., "a", "b", "z") -- **Integer:** Whole number part -- **Fractional:** Optional decimal part for insertions - -### Generating Positions - -```rust -pub struct FractionalIndex(String); - -impl FractionalIndex { - /// Create initial positions for a sequence - pub fn initial_sequence(count: usize) -> Vec { - (0..count) - .map(|i| FractionalIndex(format!("a{}", i))) - .collect() - } - - /// Generate position between two existing positions - pub fn between(before: &str, after: &str) -> Self { - // Simplified version - production needs more robust logic - let before_val = parse_position(before); - let after_val = parse_position(after); - let middle = (before_val + after_val) / 2.0; - - FractionalIndex(format_position(middle)) - } - - /// Generate position before the first element - pub fn before_first(first: &str) -> Self { - // e.g., "a0" → "a-1" - let val = parse_position(first); - FractionalIndex(format_position(val - 1.0)) - } - - /// Generate position after the last element - pub fn after_last(last: &str) -> Self { - // e.g., "a5" → "a6" - let val = parse_position(last); - FractionalIndex(format_position(val + 1.0)) - } -} - -fn parse_position(pos: &str) -> f64 { - // Strip prefix, parse number - let num_part = pos.trim_start_matches(|c: char| c.is_alphabetic()); - num_part.parse().unwrap_or(0.0) -} - -fn format_position(val: f64) -> String { - if val.fract() == 0.0 { - format!("a{}", val as i64) - } else { - format!("a{}", val) - } -} -``` - -### Insertion Examples - -```rust -// Insert between a0 and a1 -FractionalIndex::between("a0", "a1") // → "a0.5" - -// Insert between a0.5 and a1 -FractionalIndex::between("a0.5", "a1") // → "a0.75" - -// Insert between a0.75 and a1 -FractionalIndex::between("a0.75", "a1") // → "a0.875" - -// Insert at beginning -FractionalIndex::before_first("a0") // → "a-1" - -// Insert at end -FractionalIndex::after_last("a3") // → "a4" -``` - -## Subdivision Operation with Positions - -```rust -pub struct Subdivide { - pub deleted_cell_ids: Vec, - pub new_cells: Vec, -} - -pub struct NewCell { - pub id: Uuid, - pub position: String, // Fractional position - pub duration: f64, -} - -impl Subdivide { - /// Create a subdivide operation - pub fn new( - target_cells: &[Cell], - new_durations: Vec, - ) -> Self { - // Find position range for new cells - let first_pos = &target_cells.first().unwrap().position; - let last_pos = &target_cells.last().unwrap().position; - - // Generate evenly spaced positions between first and last - let new_cells = distribute_positions( - first_pos, - last_pos, - new_durations.len(), - ) - .into_iter() - .zip(new_durations) - .map(|(position, duration)| NewCell { - id: Uuid::new_v7(), - position, - duration, - }) - .collect(); - - Subdivide { - deleted_cell_ids: target_cells.iter().map(|c| c.id).collect(), - new_cells, - } - } -} - -fn distribute_positions( - first: &str, - last: &str, - count: usize, -) -> Vec { - let start = parse_position(first); - let end = parse_position(last); - let step = (end - start) / (count as f64 + 1.0); - - (1..=count) - .map(|i| format_position(start + step * i as f64)) - .collect() -} -``` - -**Example:** -```rust -// Subdivide cells at positions a1, a2, a3 into 5 new cells - -// Position range: a1 to a3 -// Generate 5 positions between them: -// a1.33, a1.67, a2.0, a2.33, a2.67 - -let subdivide = Subdivide { - deleted_cell_ids: [id_a1, id_a2, id_a3], - new_cells: [ - NewCell { id: new_id_1, position: "a1.33", duration: 0.15 }, - NewCell { id: new_id_2, position: "a1.67", duration: 0.15 }, - NewCell { id: new_id_3, position: "a2.0", duration: 0.15 }, - NewCell { id: new_id_4, position: "a2.33", duration: 0.15 }, - NewCell { id: new_id_5, position: "a2.67", duration: 0.15 }, - ], -}; -``` - -## Applying Operations - -```rust -impl Op { - fn apply(&self, realized: &mut RealizedDoc) { - match &self.payload { - OpPayload::Subdivide { new_cells, deleted_cell_ids, .. } => { - let row = realized.get_row_mut(self.row_id); - - // Mark deleted cells (tombstone) - for cell_id in deleted_cell_ids { - if let Some(cell) = row.cells.iter_mut().find(|c| c.id == *cell_id) { - cell.deleted = true; - } - } - - // Insert new cells - for new_cell in new_cells { - row.cells.push(Cell { - id: new_cell.id, - position: new_cell.position.clone(), - duration: new_cell.duration, - created_by: self.id, - deleted: false, - }); - } - - // Sort by position for display - row.cells.sort_by(|a, b| a.position.cmp(&b.position)); - } - _ => {} - } - } -} -``` - -## Handling Concurrent Insertions - -Fractional indexing ensures deterministic ordering even with concurrent insertions: - -``` -Initial: [a0: A, a1: B, a2: C] - -Alice inserts between A and B: position = "a0.5" -Bob inserts between A and B: position = "a0.5" - -Both generate the same position! -``` - -**Solution 1: Add randomness** -```rust -fn between_with_jitter(before: &str, after: &str, actor_id: &Uuid) -> String { - let base = between(before, after); - // Add tiny actor-specific offset - let offset = (actor_id.as_bytes()[0] as f64) / 1000000.0; - adjust_position(&base, offset) -} -``` - -**Solution 2: Use timestamp in position** -```rust -fn between_with_timestamp(before: &str, after: &str, timestamp: u64) -> String { - let middle = between(before, after); - format!("{}.{}", middle, timestamp % 1000) -} -``` - -**Solution 3: Accept ties (rely on cell ID for final ordering)** -```rust -impl Cell { - fn compare_for_display(&self, other: &Cell) -> Ordering { - // Primary: position - match self.position.cmp(&other.position) { - Ordering::Equal => { - // Tie-breaker: cell ID (lexicographic) - self.id.cmp(&other.id) - } - other => other, - } - } -} -``` - -## Position Space Exhaustion - -**Problem:** Repeatedly inserting between two positions eventually runs out of precision: - -``` -a0, a0.5, a0.75, a0.875, a0.9375, a0.96875, ... -``` - -**Solutions:** - -### 1. Rebalancing (Offline) -Periodically regenerate positions with even spacing: -```rust -fn rebalance_positions(cells: &mut [Cell]) { - let count = cells.len(); - for (i, cell) in cells.iter_mut().enumerate() { - cell.position = format!("a{}", i); - } -} -``` - -### 2. Multi-level Positions -Use hierarchical structure: -``` -"a0", "a0|b0", "a0|b1", "a1" -``` - -### 3. Arbitrary Precision -Use strings that can grow indefinitely: -``` -"a0", "a0.5", "a0.5.5", "a0.5.5.5", ... -``` - -## Production Implementation - -Consider using existing libraries: - -- **fractional-index** (Rust) - Figma-style implementation -- **string-ordering** - Lexicographic ordering primitives - -Example with `fractional-index`: -```rust -use fractional_index::{FractionalIndex, generate_key_between}; - -// Generate position between two cells -let pos = generate_key_between( - Some(&cell_a.position), - Some(&cell_b.position), -); -``` - -## Key Properties - -1. **Immutable** - Positions never change after creation -2. **Lexicographic** - String comparison gives correct order -3. **Dense** - Can always insert between any two positions -4. **Deterministic** - Same inputs produce same outputs -5. **Conflict-Free** - Concurrent insertions at same position are handled gracefully - -## Trade-offs - -**Pros:** -- No coordination needed for insertions -- Stable references across replicas -- Simple to reason about - -**Cons:** -- Positions can become long strings over time -- Requires periodic rebalancing in heavy-edit scenarios -- Slightly more complex than array indices - -## References - -- [Figma's Blog: Realtime Editing of Ordered Sequences](https://www.figma.com/blog/realtime-editing-of-ordered-sequences/) -- [fractional-index crate](https://crates.io/crates/fractional-index) -- [LSEQ CRDT](https://hal.archives-ouvertes.fr/hal-00921633/document) - -## Related Documents - -- See `crdt-design.md` for overall CRDT architecture -- See `branch-based-merge.md` for conflict detection and resolution diff --git a/docs/design/musical-model.md b/docs/design/musical-model.md deleted file mode 100644 index f7b941a..0000000 --- a/docs/design/musical-model.md +++ /dev/null @@ -1,321 +0,0 @@ -# Musical Model and Duration System - -**Last Updated:** 2025-11-15 - -## Overview - -This document describes how the CRDT represents musical time and rhythm, focusing on the relationship between cells, rows, and musical durations. - -## Core Concepts - -### Row Duration (Normalized) - -All rows have a **normalized duration of 1**. This is the internal representation used by the CRDT for calculations and validation. - -```rust -pub struct Row { - id: DerivedId, - cells: Vec, - expected_duration: Ratio, // Always Ratio::new(1, 1) -} -``` - -### Base Cells and Musical Duration - -The **musical duration** of a row is determined by two factors: - -1. **base_cells_per_row** - How many base cells fit in the normalized row duration of 1 -2. **Base cell musical duration** - What musical duration each base cell represents (e.g., quarter note, eighth note) - -**Example 1: One measure of 4/4 time** -``` -base_cells_per_row = 16 -base cell duration = sixteenth note -row musical duration = 16 sixteenth notes = 1 measure of 4/4 -``` - -**Example 2: Two measures of 4/4 time** -``` -base_cells_per_row = 16 -base cell duration = eighth note -row musical duration = 16 eighth notes = 2 measures of 4/4 -``` - -**Example 3: One measure of 3/4 time** -``` -base_cells_per_row = 12 -base cell duration = sixteenth note -row musical duration = 12 sixteenth notes = 1 measure of 3/4 -``` - -## Cell Duration System - -### Rational Number Representation - -Cell durations are stored as exact ratios using Rust's `num_rational::Ratio`: - -```rust -use num_rational::Ratio; - -pub struct Cell { - id: DerivedId, - duration: Ratio, // Fraction of the normalized row duration -} -``` - -**Why ratios?** -- Musical durations are inherently rational (1/4, 1/8, 1/3, etc.) -- No floating-point rounding errors -- Exact validation that cell durations sum to row duration -- Natural representation for subdivision arithmetic - -### Base Cell Duration Calculation - -For a row with `base_cells_per_row` base cells: - -```rust -let base_cell_duration = Ratio::new(1, base_cells_per_row); - -// Examples: -// base_cells_per_row = 16 → each cell is 1/16 of the row -// base_cells_per_row = 12 → each cell is 1/12 of the row -``` - -### Subdivision Duration Calculation - -When subdividing N cells into M new cells, the total duration is preserved: - -```rust -// Original: N cells, each with duration d -let total_duration = n_cells * cell_duration; - -// After subdivision: M cells -let new_cell_duration = total_duration / m_cells; - -// Example: 4 cells of 1/16 → 3 triplets -// total = 4 * (1/16) = 4/16 = 1/4 -// new = (1/4) / 3 = 1/12 per cell -``` - -**Musical interpretation:** -``` -4 sixteenth notes → 3 eighth-note triplets -Duration: 1/4 of row → 1/4 of row (preserved) -Each new cell: 1/12 of row -``` - -## Duration Invariants - -### Row Duration Preservation - -**Invariant:** The sum of all non-deleted cell durations in a row must always equal the row's expected duration. - -```rust -fn validate_row_duration(row: &Row) -> Result<(), Error> { - let actual: Ratio = row.cells.iter() - .filter(|c| !c.deleted) - .map(|c| c.duration) - .sum(); - - if actual != row.expected_duration { - return Err(Error::DurationMismatch { - expected: row.expected_duration, - actual, - }); - } - - Ok(()) -} -``` - -### Subdivision Preservation - -**Invariant:** A `ChangeSubdivisions` operation must preserve the total duration of the cells it replaces. - -```rust -// Before subdivision -let old_duration: Ratio = old_cells.iter() - .map(|c| c.duration) - .sum(); - -// After subdivision -let new_duration: Ratio = new_cells.iter() - .map(|c| c.duration) - .sum(); - -assert_eq!(old_duration, new_duration); -``` - -## Musical Duration Mapping - -While the CRDT uses normalized durations (fractions of 1), the UI needs to display actual musical durations. - -### Common Mappings - -If base cell = quarter note: -``` -CRDT Duration → Musical Duration -1/1 → whole note -1/2 → half note -1/4 → quarter note -1/8 → eighth note -1/16 → sixteenth note -1/3 → quarter note triplet (3 per half note) -1/6 → eighth note triplet -1/12 → sixteenth note triplet -``` - -If base cell = eighth note: -``` -CRDT Duration → Musical Duration -1/2 → whole note -1/4 → half note -1/8 → quarter note -1/16 → eighth note -1/32 → sixteenth note -``` - -### Converting to Musical Duration - -```rust -pub struct MusicalDuration { - pub note_value: NoteValue, // Quarter, Eighth, etc. - pub dots: u8, // Dotted notes - pub tuplet: Option, // Triplets, quintuplets, etc. -} - -pub enum NoteValue { - Whole, - Half, - Quarter, - Eighth, - Sixteenth, -} - -pub struct Tuplet { - pub notes: u32, // 3 for triplet, 5 for quintuplet - pub in_space_of: u32, // Usually notes - 1 -} - -fn to_musical_duration( - cell_duration: Ratio, - base_cell_musical_duration: NoteValue, -) -> MusicalDuration { - // Implementation depends on base cell duration and ratio - // Example: if base cell is sixteenth note and duration is 1/12, - // that's a sixteenth note triplet - todo!() -} -``` - -## Design Rationale - -### Why Normalized Row Duration? - -**Benefits:** -1. **Simplifies CRDT logic** - All rows have the same expected duration (1) -2. **Musical flexibility** - Same CRDT can represent different time signatures -3. **Clear separation** - CRDT handles ratios, UI handles musical interpretation -4. **Easier validation** - Always check if durations sum to 1 - -**Trade-off:** Requires mapping layer between CRDT and musical display - -### Why Not Store Musical Durations Directly? - -Storing "quarter note" or "eighth note" in the CRDT would: -- Couple the CRDT to Western musical notation -- Make subdivision calculations more complex -- Require context about time signature for validation -- Limit future extensions (non-Western music, variable time signatures) - -The normalized approach keeps the CRDT generic and the musical interpretation separate. - -## Future Considerations - -### Time Signature Changes - -Currently, the time signature is implicit in `base_cells_per_row`. Future enhancements might: -- Store explicit time signature metadata on rows -- Support time signature changes mid-document -- Validate subdivisions against time signature rules - -### Tempo and Absolute Time - -The CRDT currently deals with relative durations. Future work might add: -- Tempo markers (BPM) -- Conversion to absolute time (milliseconds) -- Synchronization with audio playback - -### Non-Standard Divisions - -Some musical contexts require unusual divisions: -- Nested tuplets (triplets of quintuplets) -- Irrational rhythms -- Microtiming adjustments - -The ratio system can represent any rational duration, supporting most musical use cases. - -## References - -- **CRDT Design:** `./crdt-design.md` -- **Conflict Resolution:** `./branch-based-merge.md` -- **Implementation:** `/crdt/src/lib.rs` - -## Examples - -### Example 1: Creating a Row - -```rust -// Create a row with 16 base cells (normalized duration 1) -// Musical interpretation: 16 sixteenth notes = 1 measure of 4/4 -let op = OpPayload::CreateGrid { - rows: 1, - base_cells_per_row: 16, -}; - -// Each cell has duration 1/16 -let cell_duration = Ratio::new(1, 16); -``` - -### Example 2: Subdividing into Triplets - -```rust -// Start with 4 cells, each 1/16 (= 1 quarter note) -let old_cells = vec![ - Cell { duration: Ratio::new(1, 16), .. }, - Cell { duration: Ratio::new(1, 16), .. }, - Cell { duration: Ratio::new(1, 16), .. }, - Cell { duration: Ratio::new(1, 16), .. }, -]; - -// Total duration = 4/16 = 1/4 -let total = Ratio::new(4, 16); - -// Subdivide into 3 triplets -// Each triplet = (1/4) / 3 = 1/12 -let new_cells = vec![ - Cell { duration: Ratio::new(1, 12), .. }, - Cell { duration: Ratio::new(1, 12), .. }, - Cell { duration: Ratio::new(1, 12), .. }, -]; - -// Validation: 3 * (1/12) = 3/12 = 1/4 ✓ -``` - -### Example 3: Concurrent Subdivisions - -```rust -// Initial row: 16 cells of 1/16 each -// Alice subdivides cells [0,1,2,3] into 3 triplets (each 1/12) -// Bob subdivides cells [2,3,4,5] into 5 quintuplets (each 1/20) - -// Conflict: cells 2 and 3 are targeted by both operations -// After both ops apply: -// - Cells 0,1 deleted by Alice -// - Cells 2,3,4,5 deleted by Bob -// - Cells 2,3 deleted by BOTH (conflict!) -// - Row has: original cells 6-15 + Alice's 3 triplets + Bob's 5 quintuplets -// - Duration: 10*(1/16) + 3*(1/12) + 5*(1/20) = incorrect! - -// This is why we need conflict detection -``` -- cgit v1.2.3