diff options
| author | Josh Kingsley <josh@joshkingsley.me> | 2025-11-13 13:37:19 +0200 |
|---|---|---|
| committer | Josh Kingsley <josh@joshkingsley.me> | 2025-11-13 13:37:19 +0200 |
| commit | 214be3415e27e57036f3c252188d6eb06edc55b7 (patch) | |
| tree | 5182b76a046bcc642f5f19c8609c528a27af9ed3 | |
| parent | bd1fcb295ea4ef921d9e38b44cb27f212d55f644 (diff) | |
docs: add CRDT design notes from Claude
| -rw-r--r-- | docs/design/README.md | 76 | ||||
| -rw-r--r-- | docs/design/branch-based-merge.md | 386 | ||||
| -rw-r--r-- | docs/design/crdt-design.md | 270 | ||||
| -rw-r--r-- | docs/design/fractional-indexing.md | 396 |
4 files changed, 1128 insertions, 0 deletions
diff --git a/docs/design/README.md b/docs/design/README.md new file mode 100644 index 0000000..678addb --- /dev/null +++ b/docs/design/README.md @@ -0,0 +1,76 @@ +# CRDT Design Documentation + +This directory contains the design documentation for Notive's collaborative music notation CRDT implementation. + +## Documents + +### [crdt-design.md](./crdt-design.md) +The foundational design document covering: +- Why we need a custom operation-based CRDT (not Automerge/Yjs) +- Core type definitions (Cell, Grid, Operations) +- Domain constraints (duration preservation, subdivisions) +- Initial conflict resolution options +- Related to existing TypeScript implementation + +### [fractional-indexing.md](./fractional-indexing.md) +Deep dive into stable position identifiers: +- Why array indices don't work for concurrent edits +- How fractional indexing provides stable cell references +- Implementation details and examples +- Handling position space exhaustion +- Production considerations + +### [branch-based-merge.md](./branch-based-merge.md) +Advanced conflict detection and resolution strategy: +- Detecting concurrent operations with vector clocks +- Grouping operations into causal branches +- Reconstructing per-actor views of state +- Smart conflict detection (overlapping subdivisions, lost edits, duration mismatches) +- User experience for merge conflicts +- Automatic resolution where safe, user control where ambiguous + +## Reading Order + +For someone new to the project: +1. Start with **crdt-design.md** for context and requirements +2. Read **fractional-indexing.md** to understand cell positioning +3. Review **branch-based-merge.md** for the conflict resolution strategy + +## Implementation Status + +The CRDT is being implemented as a Rust crate in `/crdt`: +- ✅ Vector clock implementation +- ✅ Basic Doc structure with operation log +- ⏳ Core domain types (Cell, Row, Grid) +- ⏳ CreateGrid operation +- ⏳ Subdivide operation +- ⏳ Fractional indexing +- ⏳ Conflict detection +- ⏳ Branch-based merge + +## Design Principles + +1. **Convergence** - All replicas eventually reach the same state +2. **Commutativity** - Operations can be applied in any order +3. **Idempotency** - Re-applying operations is safe +4. **No Silent Data Loss** - Conflicts are explicit, not silently resolved +5. **Causal Consistency** - Operations respect causality +6. **Domain Invariants** - Preserve musical constraints (e.g., row duration) + +## Key Architectural Decisions + +- **Operation-based CRDT** (not state-based) for semantic operations +- **Vector clocks** for causality tracking (future: consider Hybrid Logical Clocks) +- **Fractional indexing** for stable positions +- **Tombstones** for deletions (never truly remove cells) +- **Conflict detection** rather than automatic LWW +- **User-driven resolution** for ambiguous concurrent edits + +## Future Work + +- Hybrid Logical Clocks for wall-clock ordering +- Snapshot and compaction strategy +- Server-side operation validation +- Operational transformation for some conflicts +- Multi-grid support and cross-grid operations +- Undo/redo with CRDT semantics diff --git a/docs/design/branch-based-merge.md b/docs/design/branch-based-merge.md new file mode 100644 index 0000000..331af5f --- /dev/null +++ b/docs/design/branch-based-merge.md @@ -0,0 +1,386 @@ +# Branch-Based Merge with Conflict Detection + +**Last Updated:** 2025-11-13 + +## Overview + +This document describes the approach for handling concurrent operations in the CRDT, using vector clocks to detect causal branches and implementing smart merge logic that automatically resolves safe conflicts while flagging ambiguous ones for user resolution. + +## The Core Challenge + +When two users make concurrent subdivisions to a music notation grid, we need to: +1. Apply both operations (CRDT convergence requirement) +2. Detect when operations conflict (overlapping edits, duration violations) +3. Present conflicts to users in an understandable way +4. Preserve both users' intent without silent data loss + +## Detecting Concurrent Operations + +Vector clocks allow us to determine if two operations are concurrent (neither causally depends on the other): + +```rust +impl VectorClock { + pub fn is_concurrent_with(&self, other: &VectorClock) -> bool { + // Two clocks are concurrent if neither is ≤ the other + // This means each has seen events the other hasn't + match self.partial_cmp(other) { + None => true, // Incomparable = concurrent + Some(Ordering::Equal) => false, + _ => false, + } + } +} +``` + +**Example:** +``` +Alice's clock: {alice: 5, bob: 2} +Bob's clock: {alice: 3, bob: 6} + +Alice has seen 5 of her ops, 2 of Bob's +Bob has seen 3 of Alice's ops, 6 of his own + +Neither clock is ≤ the other → Concurrent! +``` + +## Grouping Operations into Causal Branches + +Operations form a directed acyclic graph (DAG) based on causality: + +```rust +pub struct Branch { + pub tip_clock: VectorClock, + pub ops: Vec<Op>, +} + +impl Doc { + /// Detect causal branches in the operation history + pub fn detect_branches(&self) -> Vec<Branch> { + let mut branches: Vec<Branch> = vec![]; + + for op in &self.ops { + // Find branches this op extends (causally after) + let extends: Vec<usize> = branches.iter() + .enumerate() + .filter(|(_, b)| op.clock > b.tip_clock) + .map(|(i, _)| i) + .collect(); + + match extends.len() { + 0 => { + // New concurrent branch + branches.push(Branch { + tip_clock: op.clock.clone(), + ops: vec![op.clone()], + }); + } + 1 => { + // Extends single branch (linear history) + let branch = &mut branches[extends[0]]; + branch.ops.push(op.clone()); + branch.tip_clock = op.clock.clone(); + } + _ => { + // Merges multiple branches + // This op can see all previous concurrent branches + // Could create a new "merge node" in the DAG + } + } + } + + branches + } +} +``` + +## Reconstructing Actor Views + +Each actor has a causal view of the state at their last operation: + +```rust +impl Doc { + /// Realize the state as a specific actor saw it + pub fn realize_actor_view(&self, actor_id: &Uuid) -> RealizedDoc { + let last_op = self.ops.iter() + .filter(|op| op.clock.get(actor_id) > 0) + .last(); + + if let Some(last_op) = last_op { + let mut realized = RealizedDoc::default(); + + // Apply only operations in this actor's causal past + for op in &self.ops { + if op.clock <= last_op.clock { + op.apply(&mut realized); + } + } + + realized + } else { + RealizedDoc::default() + } + } +} +``` + +**Why this works:** Vector clocks define a partial order. If `op.clock <= actor_last_clock`, then the actor had seen (or could have seen) that operation when they made their edit. + +## Smart Conflict Detection + +Not all concurrent operations conflict! We detect specific problematic patterns: + +### 1. Overlapping Subdivisions + +```rust +fn detect_overlapping_subdivisions(&self) -> Vec<ConflictDetail> { + let mut conflicts = vec![]; + + for (i, op_a) in self.ops.iter().enumerate() { + if let OpPayload::Subdivide { deleted_cell_ids: cells_a, .. } = &op_a.payload { + for op_b in &self.ops[i+1..] { + if let OpPayload::Subdivide { deleted_cell_ids: cells_b, .. } = &op_b.payload { + // Must be concurrent + if !op_a.clock.is_concurrent_with(&op_b.clock) { + continue; + } + + // Check for overlapping target cells + let overlap: Vec<_> = cells_a.iter() + .filter(|c| cells_b.contains(c)) + .cloned() + .collect(); + + if !overlap.is_empty() { + conflicts.push(ConflictDetail { + kind: ConflictKind::OverlappingSubdivision, + ops: vec![op_a.id, op_b.id], + affected_cells: overlap, + }); + } + } + } + } + } + + conflicts +} +``` + +**Example:** +``` +Initial: [A, B, C, D] + +Alice subdivides [A, B] → concurrent → Both apply +Bob subdivides [B, C] → concurrent → Duration mismatch! + +Cell B is deleted by both operations +Result has extra duration from both new cell sets +``` + +### 2. Lost Edits (Edit-Delete Conflict) + +```rust +fn detect_lost_edits(&self) -> Vec<ConflictDetail> { + let mut conflicts = vec![]; + + for (i, op_a) in self.ops.iter().enumerate() { + if let OpPayload::SetCellValue { cell_id, .. } = op_a.payload { + for op_b in &self.ops[i+1..] { + if let OpPayload::Subdivide { deleted_cell_ids, .. } = &op_b.payload { + if op_a.clock.is_concurrent_with(&op_b.clock) + && deleted_cell_ids.contains(&cell_id) { + conflicts.push(ConflictDetail { + kind: ConflictKind::CellValueLostBySubdivision, + ops: vec![op_a.id, op_b.id], + affected_cells: vec![cell_id], + }); + } + } + } + } + } + + conflicts +} +``` + +**Example:** +``` +Alice sets cell B value to "C#" → concurrent +Bob subdivides [B, C] → concurrent + +Alice's edit is "lost" because B is deleted +User should be warned about this +``` + +### 3. Duration Mismatches + +```rust +fn detect_duration_mismatches(&self, realized: &RealizedDoc) -> Vec<ConflictDetail> { + let mut conflicts = vec![]; + + for row in &realized.grid.rows { + let active_cells: Vec<_> = row.cells.iter() + .filter(|c| !c.deleted) + .collect(); + + let actual: f64 = active_cells.iter() + .map(|c| c.duration) + .sum(); + + let expected = row.expected_duration; + + if (actual - expected).abs() > 0.001 { + // Find which operations affected this row + let affecting_ops = self.find_ops_affecting_row(row.id); + + conflicts.push(ConflictDetail { + kind: ConflictKind::DurationMismatch { + expected, + actual, + }, + ops: affecting_ops, + affected_cells: active_cells.iter().map(|c| c.id).collect(), + }); + } + } + + conflicts +} +``` + +## The Smart Merge Algorithm + +```rust +pub enum MergeResult { + Clean(RealizedDoc), + Conflict { + branches: Vec<(Uuid, RealizedDoc)>, // Actor views + conflicts: Vec<ConflictDetail>, + }, +} + +impl Doc { + pub fn smart_merge(&self) -> MergeResult { + // Apply all operations (CRDT convergence) + let mut merged = RealizedDoc::default(); + for op in &self.ops { + op.apply(&mut merged); + } + + // Detect conflicts + let mut conflicts = vec![]; + conflicts.extend(self.detect_overlapping_subdivisions()); + conflicts.extend(self.detect_lost_edits()); + conflicts.extend(self.detect_duration_mismatches(&merged)); + + if conflicts.is_empty() { + // Clean merge - all concurrent ops are compatible + MergeResult::Clean(merged) + } else { + // Reconstruct each actor's view for the merge UI + let actors = self.get_divergent_actors(); + let actor_views = actors.into_iter() + .map(|actor_id| (actor_id, self.realize_actor_view(&actor_id))) + .collect(); + + MergeResult::Conflict { + branches: actor_views, + conflicts, + } + } + } +} +``` + +## User Experience + +### Auto-Merge (No Conflict) + +``` +Alice subdivides [A, B] → Triplets +Bob subdivides [C, D] → Quintuplets + +No overlap, no conflict +Result: [Triplet1, Triplet2, Triplet3, Quintuplet1, ..., Quintuplet5] +Duration preserved ✓ +``` + +### Conflict Detection UI + +``` +┌──────────────────────────────────────────────────────────┐ +│ Merge Conflict: Overlapping Subdivisions │ +├──────────────────────────────────────────────────────────┤ +│ │ +│ Alice and Bob edited the same cells concurrently │ +│ │ +│ Alice's version (Row 1): │ +│ ┌───┬───┬───┬───┐ │ +│ │T1 │T2 │T3 │ D │ Duration: 1.0 ✓ │ +│ └───┴───┴───┴───┘ │ +│ Triplets from subdividing [A, B, C] │ +│ │ +│ Bob's version (Row 1): │ +│ ┌───┬───┬───┬───┬───┬───┐ │ +│ │ A │Q1 │Q2 │Q3 │Q4 │Q5 │ Duration: 1.0 ✓ │ +│ └───┴───┴───┴───┴───┴───┘ │ +│ Quintuplets from subdividing [B, C, D] │ +│ │ +│ Merged result (both applied): │ +│ ┌───┬───┬───┬───┬───┬───┬───┬───┐ │ +│ │T1 │T2 │T3 │Q1 │Q2 │Q3 │Q4 │Q5 │ Duration: 1.5 ✗ │ +│ └───┴───┴───┴───┴───┴───┴───┴───┘ │ +│ │ +│ [ Keep Alice's version ] │ +│ [ Keep Bob's version ] │ +│ [ Keep both as separate grids ] │ +│ [ Manually resolve ] │ +└──────────────────────────────────────────────────────────┘ +``` + +## Resolution Strategies + +### 1. Keep One Version +Create a new operation that: +- Undoes the unwanted changes (marks those cells as deleted) +- Re-applies one actor's version as a new operation +- Preserves causality (new operation depends on both branches) + +### 2. Keep Both as Separate Grids +- Fork the grid into two separate grids +- Each preserves one actor's version +- Allows manual copying between grids later + +### 3. Manual Resolution +- UI allows editing the merged (invalid) state +- Create operations to fix duration mismatches +- New operations causally depend on all previous ops + +## Key Design Properties + +1. **No Silent Data Loss** - Both operations always apply; conflicts are explicit +2. **Automatic Safe Merges** - Non-overlapping concurrent edits merge cleanly +3. **Deterministic Convergence** - All replicas converge to same state (including conflicts) +4. **User Control** - Ambiguous cases are presented to users, not arbitrarily resolved +5. **Causal Consistency** - Vector clocks ensure we can reconstruct any actor's view + +## Implementation Notes + +- Store all operations (even if they create conflicts) +- Conflicts are part of the realized state, not errors +- Resolution creates new operations (not undoing history) +- UI must handle temporary invalid states gracefully +- Consider conflict "severity" (overlapping subdivisions vs. minor duration drift) + +## Future Enhancements + +- **Operational Transformation** - Automatically adjust positions in some cases +- **Three-way Merge** - Show common ancestor state in conflict UI +- **Conflict Clustering** - Group related conflicts (e.g., same row) +- **Suggested Resolutions** - AI/heuristic-based merge suggestions + +## References + +- See `crdt-design.md` for overall CRDT architecture +- See `fractional-indexing.md` for position semantics diff --git a/docs/design/crdt-design.md b/docs/design/crdt-design.md new file mode 100644 index 0000000..5e6dd2d --- /dev/null +++ b/docs/design/crdt-design.md @@ -0,0 +1,270 @@ +# CRDT Design for Music Notation Grid + +## Overview + +This document captures the design considerations for implementing a CRDT-based state management system for a music notation grid that supports rhythmic subdivisions (e.g., triplets, quintuplets). + +**Last Updated:** 2025-11-12 + +## Domain Context + +- **Grid Structure:** Rows of cells representing rhythmic divisions +- **Base Unit:** Cells represent musical durations (e.g., quarter notes) +- **Subdivisions:** Replacing N cells with M cells of different duration (e.g., 4 eighth notes → 3 eighth note triplets) +- **Critical Constraint:** Total rhythmic duration of each row must remain constant (width preservation) + +## Why Not Use Automerge/JSON CRDT? + +### Problems with Generic JSON CRDTs + +1. **Invariant Enforcement:** Automerge doesn't know about width constraints + - Concurrent edits might violate total duration + - No way to "reject" a CRDT merge after the fact + - Can't undo without breaking causality + +2. **Wrong Abstraction Level:** + - Automerge provides list/map primitives + - We need rhythmic subdivision semantics + - Like using text CRDT for spreadsheet formulas + +3. **Validation After Merge:** + ```typescript + const doc = Automerge.merge(docA, docB); + if (!isValidWidth(doc.grid.rows[0])) { + // Now what? Can't reject, can't undo easily + } + ``` + +## Recommended Approach: Operation-Based CRDT + +### Core Principles + +1. **Cell IDs instead of indices** - stable references across replicas +2. **Subdivide = delete + insert** - clear semantics that preserve duration +3. **Conflict detection** - width violations shown to users, not silently resolved +4. **Operation-level commutativity** - same result regardless of operation order + +### Type Definitions + +```typescript +export type CellId = string; // UUID + +export type Cell = Immutable<{ + id: CellId; + + // Fractional index for positioning (like Figma, Linear) + position: string; // e.g., "a0", "a0.5", "a1" + + // Rhythmic duration this cell represents (e.g., 0.25 = quarter note) + duration: number; + + // Reference to parent if this is a subdivision + parentCellId: CellId | null; + + // CRDT metadata + createdBy: string; // OpId + timestamp: HLCTimestamp; + deleted: boolean; // Tombstone for deletions +}>; + +export type CreateGrid = Immutable<{ + type: "createGrid"; + opId: string; + timestamp: HLCTimestamp; + gridId: string; + rows: number; + baseCellsPerRow: number; + + // Pre-generated cell IDs for initial cells + initialCellIds: CellId[][]; // [row][col] +}>; + +export type Subdivide = Immutable<{ + type: "subdivide"; + opId: string; + timestamp: HLCTimestamp; + gridId: string; + rowIndex: number; + + // Cells to mark as deleted (tombstone) + deletedCellIds: CellId[]; + + // New cells to insert + newCells: Array<{ + id: CellId; + position: string; // Fractional index between neighbors + duration: number; // Must sum to deleted cells' total duration + }>; +}>; +``` + +### Key Design Decisions + +#### 1. Fractional Positioning + +Uses fractional indexing (similar to Figma's approach) to allow insertion between any two cells without reindexing. + +```typescript +// Example positions: +"a0" // First cell +"a0.5" // Inserted between first and second +"a1" // Second cell +"a1.25" // Inserted between second and third +``` + +#### 2. Tombstones for Deletions + +Cells are never truly deleted, just marked with `deleted: true`. This ensures: +- Idempotent operations (deleting twice is safe) +- Causal consistency (can tell what existed when) +- Conflict detection (know if concurrent ops targeted same cells) + +#### 3. Duration Preservation by Construction + +Each `Subdivide` operation ensures: +```typescript +const deletedDuration = deletedCellIds.reduce( + (sum, id) => sum + getCellById(id).duration, + 0 +); + +const insertedDuration = newCells.reduce( + (sum, cell) => sum + cell.duration, + 0 +); + +// Must be equal (enforced when creating operation) +assert(deletedDuration === insertedDuration); +``` + +## Conflict Resolution Options + +### Option 1: Last-Writer-Wins Per Cell + +```typescript +function applySubdivide(cells: Cell[], op: Subdivide): Cell[] { + const targetCellsExist = op.deletedCellIds.every(id => + cells.find(c => c.id === id && !c.deleted) + ); + + if (!targetCellsExist) { + // Target cells already deleted by concurrent op + // This op becomes no-op (idempotent) + return cells; + } + + // Apply deletion + insertion + // ... +} +``` + +**Pros:** +- Simple, deterministic +- Guaranteed commutative (same result regardless of order) + +**Cons:** +- Silently drops one user's edit +- No way to recover lost intent + +### Option 2: Hierarchical Cell Model + +Subdivisions create nested structure: + +```typescript +export type Cell = Immutable<{ + id: CellId; + duration: number; + + // If subdivided, contains child cells + children: Cell[] | null; + + // For CRDT: track modifications + createdBy: string; + subdivideOps: Set<string>; +}>; +``` + +**Pros:** +- Preserves both users' intent +- Can show both subdivisions in UI + +**Cons:** +- Complex merge logic +- Unclear what "subdivide an already-subdivided cell" means +- May require user resolution anyway + +### Option 3: Conflict Detection + User Resolution (RECOMMENDED) + +Allow concurrent operations to apply, but detect conflicts: + +```typescript +function validateRowWidth(row: Row): ValidationResult { + const activeCells = row.cells.filter(c => !c.deleted); + const totalDuration = activeCells.reduce((sum, c) => sum + c.duration, 0); + const expected = getExpectedDuration(row); + + if (totalDuration !== expected) { + return { + valid: false, + conflictingOps: findConflictingOps(row), + affectedCells: findAffectedCells(row), + suggestedResolution: { + // Provide options to user + } + }; + } + + return { valid: true }; +} +``` + +**Pros:** +- No silent data loss +- Users understand what happened +- Can choose best resolution per case + +**Cons:** +- Requires UI for conflict resolution +- Temporary invalid states exist +- More complex UX + +## Open Questions + +1. **Which conflict resolution method to use?** + - Pure LWW? + - Conflict detection + user resolution? + - Hybrid approach? + +2. **Fractional indexing implementation:** + - Use existing library or custom? + - How to handle position exhaustion? + +3. **Operation validation:** + - Client-side only or server-side too? + - How to handle invalid operations from old clients? + +4. **Snapshot strategy:** + - When to create snapshots for performance? + - How to compact operation history? + +## Next Steps + +- [ ] Decide on conflict resolution strategy +- [ ] Implement basic Cell ID system +- [ ] Replace index-based operations with ID-based +- [ ] Add operation IDs and timestamps (HLC) +- [ ] Implement fractional positioning +- [ ] Test commutativity with randomized operation order +- [ ] Design conflict UI (if using Option 3) + +## References + +- [Automerge Documentation](https://automerge.org/) +- [Yjs CRDT](https://docs.yjs.dev/) +- [Figma's Fractional Indexing](https://www.figma.com/blog/realtime-editing-of-ordered-sequences/) +- [Hybrid Logical Clocks Paper](https://jaredforsyth.com/posts/hybrid-logical-clocks/) + +## Related Code + +- Current implementation: `src/doc/index.ts` +- Grid realization logic: `src/doc/index.ts:87-125` diff --git a/docs/design/fractional-indexing.md b/docs/design/fractional-indexing.md new file mode 100644 index 0000000..287e719 --- /dev/null +++ b/docs/design/fractional-indexing.md @@ -0,0 +1,396 @@ +# Fractional Indexing for Stable Cell Positions + +**Last Updated:** 2025-11-13 + +## Overview + +Fractional indexing provides stable, immutable identifiers for cell positions in an ordered sequence, enabling concurrent insertions without the coordination problems of array indices. + +## The Problem with Array Indices + +Consider this scenario with traditional array indices: + +``` +Initial state: +[0: A, 1: B, 2: C, 3: D] + +Alice subdivides cells 1-2 (B, C) +Bob subdivides cells 2-3 (C, D) + +After Alice's operation: +[0: A, 1: T1, 2: T2, 3: T3, 4: D] +Now "cell 2" means T2 + +After Bob's operation (he thinks "cell 2" is C): +[0: A, 1: T1, 2: T2, 3: T3, 4: Q1, 5: Q2, 6: Q3, 7: D] +But C is gone! Bob's operation is invalid. +``` + +**The problem:** Indices are relative positions that change when the array is modified. Concurrent operations reference different cells by the same index. + +## Fractional Indexing Solution + +Instead of indices, each cell has a **position string** that never changes: + +``` +Initial state: +Cell A: position="a0" +Cell B: position="a1" +Cell C: position="a2" +Cell D: position="a3" + +Alice subdivides [B, C] (position="a1" and "a2") +Bob subdivides [C, D] (position="a2" and "a3") + +Alice creates: + T1: position="a1.2" (between a1 and a2) + T2: position="a1.5" + T3: position="a1.8" + +Bob creates: + Q1: position="a2.2" (between a2 and a3) + Q2: position="a2.5" + Q3: position="a2.8" + +Merged result (sorted by position): +a0 (A), a1 (B, deleted), a1.2 (T1), a1.5 (T2), a1.8 (T3), +a2 (C, deleted), a2.2 (Q1), a2.5 (Q2), a2.8 (Q3), a3 (D, deleted) +``` + +**Key insight:** Position strings are immutable. Operations reference cells by their stable position, not their current index in the array. + +## Position String Format + +Positions are lexicographically sortable strings: + +``` +"a0" < "a0.5" < "a1" < "a1.25" < "a2" +``` + +### Base Structure + +``` +[prefix][integer].[fractional] +``` + +- **Prefix:** Letter(s) for global ordering (e.g., "a", "b", "z") +- **Integer:** Whole number part +- **Fractional:** Optional decimal part for insertions + +### Generating Positions + +```rust +pub struct FractionalIndex(String); + +impl FractionalIndex { + /// Create initial positions for a sequence + pub fn initial_sequence(count: usize) -> Vec<Self> { + (0..count) + .map(|i| FractionalIndex(format!("a{}", i))) + .collect() + } + + /// Generate position between two existing positions + pub fn between(before: &str, after: &str) -> Self { + // Simplified version - production needs more robust logic + let before_val = parse_position(before); + let after_val = parse_position(after); + let middle = (before_val + after_val) / 2.0; + + FractionalIndex(format_position(middle)) + } + + /// Generate position before the first element + pub fn before_first(first: &str) -> Self { + // e.g., "a0" → "a-1" + let val = parse_position(first); + FractionalIndex(format_position(val - 1.0)) + } + + /// Generate position after the last element + pub fn after_last(last: &str) -> Self { + // e.g., "a5" → "a6" + let val = parse_position(last); + FractionalIndex(format_position(val + 1.0)) + } +} + +fn parse_position(pos: &str) -> f64 { + // Strip prefix, parse number + let num_part = pos.trim_start_matches(|c: char| c.is_alphabetic()); + num_part.parse().unwrap_or(0.0) +} + +fn format_position(val: f64) -> String { + if val.fract() == 0.0 { + format!("a{}", val as i64) + } else { + format!("a{}", val) + } +} +``` + +### Insertion Examples + +```rust +// Insert between a0 and a1 +FractionalIndex::between("a0", "a1") // → "a0.5" + +// Insert between a0.5 and a1 +FractionalIndex::between("a0.5", "a1") // → "a0.75" + +// Insert between a0.75 and a1 +FractionalIndex::between("a0.75", "a1") // → "a0.875" + +// Insert at beginning +FractionalIndex::before_first("a0") // → "a-1" + +// Insert at end +FractionalIndex::after_last("a3") // → "a4" +``` + +## Subdivision Operation with Positions + +```rust +pub struct Subdivide { + pub deleted_cell_ids: Vec<Uuid>, + pub new_cells: Vec<NewCell>, +} + +pub struct NewCell { + pub id: Uuid, + pub position: String, // Fractional position + pub duration: f64, +} + +impl Subdivide { + /// Create a subdivide operation + pub fn new( + target_cells: &[Cell], + new_durations: Vec<f64>, + ) -> Self { + // Find position range for new cells + let first_pos = &target_cells.first().unwrap().position; + let last_pos = &target_cells.last().unwrap().position; + + // Generate evenly spaced positions between first and last + let new_cells = distribute_positions( + first_pos, + last_pos, + new_durations.len(), + ) + .into_iter() + .zip(new_durations) + .map(|(position, duration)| NewCell { + id: Uuid::new_v7(), + position, + duration, + }) + .collect(); + + Subdivide { + deleted_cell_ids: target_cells.iter().map(|c| c.id).collect(), + new_cells, + } + } +} + +fn distribute_positions( + first: &str, + last: &str, + count: usize, +) -> Vec<String> { + let start = parse_position(first); + let end = parse_position(last); + let step = (end - start) / (count as f64 + 1.0); + + (1..=count) + .map(|i| format_position(start + step * i as f64)) + .collect() +} +``` + +**Example:** +```rust +// Subdivide cells at positions a1, a2, a3 into 5 new cells + +// Position range: a1 to a3 +// Generate 5 positions between them: +// a1.33, a1.67, a2.0, a2.33, a2.67 + +let subdivide = Subdivide { + deleted_cell_ids: [id_a1, id_a2, id_a3], + new_cells: [ + NewCell { id: new_id_1, position: "a1.33", duration: 0.15 }, + NewCell { id: new_id_2, position: "a1.67", duration: 0.15 }, + NewCell { id: new_id_3, position: "a2.0", duration: 0.15 }, + NewCell { id: new_id_4, position: "a2.33", duration: 0.15 }, + NewCell { id: new_id_5, position: "a2.67", duration: 0.15 }, + ], +}; +``` + +## Applying Operations + +```rust +impl Op { + fn apply(&self, realized: &mut RealizedDoc) { + match &self.payload { + OpPayload::Subdivide { new_cells, deleted_cell_ids, .. } => { + let row = realized.get_row_mut(self.row_id); + + // Mark deleted cells (tombstone) + for cell_id in deleted_cell_ids { + if let Some(cell) = row.cells.iter_mut().find(|c| c.id == *cell_id) { + cell.deleted = true; + } + } + + // Insert new cells + for new_cell in new_cells { + row.cells.push(Cell { + id: new_cell.id, + position: new_cell.position.clone(), + duration: new_cell.duration, + created_by: self.id, + deleted: false, + }); + } + + // Sort by position for display + row.cells.sort_by(|a, b| a.position.cmp(&b.position)); + } + _ => {} + } + } +} +``` + +## Handling Concurrent Insertions + +Fractional indexing ensures deterministic ordering even with concurrent insertions: + +``` +Initial: [a0: A, a1: B, a2: C] + +Alice inserts between A and B: position = "a0.5" +Bob inserts between A and B: position = "a0.5" + +Both generate the same position! +``` + +**Solution 1: Add randomness** +```rust +fn between_with_jitter(before: &str, after: &str, actor_id: &Uuid) -> String { + let base = between(before, after); + // Add tiny actor-specific offset + let offset = (actor_id.as_bytes()[0] as f64) / 1000000.0; + adjust_position(&base, offset) +} +``` + +**Solution 2: Use timestamp in position** +```rust +fn between_with_timestamp(before: &str, after: &str, timestamp: u64) -> String { + let middle = between(before, after); + format!("{}.{}", middle, timestamp % 1000) +} +``` + +**Solution 3: Accept ties (rely on cell ID for final ordering)** +```rust +impl Cell { + fn compare_for_display(&self, other: &Cell) -> Ordering { + // Primary: position + match self.position.cmp(&other.position) { + Ordering::Equal => { + // Tie-breaker: cell ID (lexicographic) + self.id.cmp(&other.id) + } + other => other, + } + } +} +``` + +## Position Space Exhaustion + +**Problem:** Repeatedly inserting between two positions eventually runs out of precision: + +``` +a0, a0.5, a0.75, a0.875, a0.9375, a0.96875, ... +``` + +**Solutions:** + +### 1. Rebalancing (Offline) +Periodically regenerate positions with even spacing: +```rust +fn rebalance_positions(cells: &mut [Cell]) { + let count = cells.len(); + for (i, cell) in cells.iter_mut().enumerate() { + cell.position = format!("a{}", i); + } +} +``` + +### 2. Multi-level Positions +Use hierarchical structure: +``` +"a0", "a0|b0", "a0|b1", "a1" +``` + +### 3. Arbitrary Precision +Use strings that can grow indefinitely: +``` +"a0", "a0.5", "a0.5.5", "a0.5.5.5", ... +``` + +## Production Implementation + +Consider using existing libraries: + +- **fractional-index** (Rust) - Figma-style implementation +- **string-ordering** - Lexicographic ordering primitives + +Example with `fractional-index`: +```rust +use fractional_index::{FractionalIndex, generate_key_between}; + +// Generate position between two cells +let pos = generate_key_between( + Some(&cell_a.position), + Some(&cell_b.position), +); +``` + +## Key Properties + +1. **Immutable** - Positions never change after creation +2. **Lexicographic** - String comparison gives correct order +3. **Dense** - Can always insert between any two positions +4. **Deterministic** - Same inputs produce same outputs +5. **Conflict-Free** - Concurrent insertions at same position are handled gracefully + +## Trade-offs + +**Pros:** +- No coordination needed for insertions +- Stable references across replicas +- Simple to reason about + +**Cons:** +- Positions can become long strings over time +- Requires periodic rebalancing in heavy-edit scenarios +- Slightly more complex than array indices + +## References + +- [Figma's Blog: Realtime Editing of Ordered Sequences](https://www.figma.com/blog/realtime-editing-of-ordered-sequences/) +- [fractional-index crate](https://crates.io/crates/fractional-index) +- [LSEQ CRDT](https://hal.archives-ouvertes.fr/hal-00921633/document) + +## Related Documents + +- See `crdt-design.md` for overall CRDT architecture +- See `branch-based-merge.md` for conflict detection and resolution |
