summaryrefslogtreecommitdiff
path: root/docs/design/crdt-design.md
blob: 5e6dd2d6d9e4094425f89be3cdbd2a08a01d8a9b (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
# CRDT Design for Music Notation Grid

## Overview

This document captures the design considerations for implementing a CRDT-based state management system for a music notation grid that supports rhythmic subdivisions (e.g., triplets, quintuplets).

**Last Updated:** 2025-11-12

## Domain Context

- **Grid Structure:** Rows of cells representing rhythmic divisions
- **Base Unit:** Cells represent musical durations (e.g., quarter notes)
- **Subdivisions:** Replacing N cells with M cells of different duration (e.g., 4 eighth notes → 3 eighth note triplets)
- **Critical Constraint:** Total rhythmic duration of each row must remain constant (width preservation)

## Why Not Use Automerge/JSON CRDT?

### Problems with Generic JSON CRDTs

1. **Invariant Enforcement:** Automerge doesn't know about width constraints
   - Concurrent edits might violate total duration
   - No way to "reject" a CRDT merge after the fact
   - Can't undo without breaking causality

2. **Wrong Abstraction Level:**
   - Automerge provides list/map primitives
   - We need rhythmic subdivision semantics
   - Like using text CRDT for spreadsheet formulas

3. **Validation After Merge:**
   ```typescript
   const doc = Automerge.merge(docA, docB);
   if (!isValidWidth(doc.grid.rows[0])) {
     // Now what? Can't reject, can't undo easily
   }
   ```

## Recommended Approach: Operation-Based CRDT

### Core Principles

1. **Cell IDs instead of indices** - stable references across replicas
2. **Subdivide = delete + insert** - clear semantics that preserve duration
3. **Conflict detection** - width violations shown to users, not silently resolved
4. **Operation-level commutativity** - same result regardless of operation order

### Type Definitions

```typescript
export type CellId = string; // UUID

export type Cell = Immutable<{
  id: CellId;

  // Fractional index for positioning (like Figma, Linear)
  position: string;  // e.g., "a0", "a0.5", "a1"

  // Rhythmic duration this cell represents (e.g., 0.25 = quarter note)
  duration: number;

  // Reference to parent if this is a subdivision
  parentCellId: CellId | null;

  // CRDT metadata
  createdBy: string;  // OpId
  timestamp: HLCTimestamp;
  deleted: boolean;  // Tombstone for deletions
}>;

export type CreateGrid = Immutable<{
  type: "createGrid";
  opId: string;
  timestamp: HLCTimestamp;
  gridId: string;
  rows: number;
  baseCellsPerRow: number;

  // Pre-generated cell IDs for initial cells
  initialCellIds: CellId[][];  // [row][col]
}>;

export type Subdivide = Immutable<{
  type: "subdivide";
  opId: string;
  timestamp: HLCTimestamp;
  gridId: string;
  rowIndex: number;

  // Cells to mark as deleted (tombstone)
  deletedCellIds: CellId[];

  // New cells to insert
  newCells: Array<{
    id: CellId;
    position: string;  // Fractional index between neighbors
    duration: number;  // Must sum to deleted cells' total duration
  }>;
}>;
```

### Key Design Decisions

#### 1. Fractional Positioning

Uses fractional indexing (similar to Figma's approach) to allow insertion between any two cells without reindexing.

```typescript
// Example positions:
"a0"     // First cell
"a0.5"   // Inserted between first and second
"a1"     // Second cell
"a1.25"  // Inserted between second and third
```

#### 2. Tombstones for Deletions

Cells are never truly deleted, just marked with `deleted: true`. This ensures:
- Idempotent operations (deleting twice is safe)
- Causal consistency (can tell what existed when)
- Conflict detection (know if concurrent ops targeted same cells)

#### 3. Duration Preservation by Construction

Each `Subdivide` operation ensures:
```typescript
const deletedDuration = deletedCellIds.reduce(
  (sum, id) => sum + getCellById(id).duration,
  0
);

const insertedDuration = newCells.reduce(
  (sum, cell) => sum + cell.duration,
  0
);

// Must be equal (enforced when creating operation)
assert(deletedDuration === insertedDuration);
```

## Conflict Resolution Options

### Option 1: Last-Writer-Wins Per Cell

```typescript
function applySubdivide(cells: Cell[], op: Subdivide): Cell[] {
  const targetCellsExist = op.deletedCellIds.every(id =>
    cells.find(c => c.id === id && !c.deleted)
  );

  if (!targetCellsExist) {
    // Target cells already deleted by concurrent op
    // This op becomes no-op (idempotent)
    return cells;
  }

  // Apply deletion + insertion
  // ...
}
```

**Pros:**
- Simple, deterministic
- Guaranteed commutative (same result regardless of order)

**Cons:**
- Silently drops one user's edit
- No way to recover lost intent

### Option 2: Hierarchical Cell Model

Subdivisions create nested structure:

```typescript
export type Cell = Immutable<{
  id: CellId;
  duration: number;

  // If subdivided, contains child cells
  children: Cell[] | null;

  // For CRDT: track modifications
  createdBy: string;
  subdivideOps: Set<string>;
}>;
```

**Pros:**
- Preserves both users' intent
- Can show both subdivisions in UI

**Cons:**
- Complex merge logic
- Unclear what "subdivide an already-subdivided cell" means
- May require user resolution anyway

### Option 3: Conflict Detection + User Resolution (RECOMMENDED)

Allow concurrent operations to apply, but detect conflicts:

```typescript
function validateRowWidth(row: Row): ValidationResult {
  const activeCells = row.cells.filter(c => !c.deleted);
  const totalDuration = activeCells.reduce((sum, c) => sum + c.duration, 0);
  const expected = getExpectedDuration(row);

  if (totalDuration !== expected) {
    return {
      valid: false,
      conflictingOps: findConflictingOps(row),
      affectedCells: findAffectedCells(row),
      suggestedResolution: {
        // Provide options to user
      }
    };
  }

  return { valid: true };
}
```

**Pros:**
- No silent data loss
- Users understand what happened
- Can choose best resolution per case

**Cons:**
- Requires UI for conflict resolution
- Temporary invalid states exist
- More complex UX

## Open Questions

1. **Which conflict resolution method to use?**
   - Pure LWW?
   - Conflict detection + user resolution?
   - Hybrid approach?

2. **Fractional indexing implementation:**
   - Use existing library or custom?
   - How to handle position exhaustion?

3. **Operation validation:**
   - Client-side only or server-side too?
   - How to handle invalid operations from old clients?

4. **Snapshot strategy:**
   - When to create snapshots for performance?
   - How to compact operation history?

## Next Steps

- [ ] Decide on conflict resolution strategy
- [ ] Implement basic Cell ID system
- [ ] Replace index-based operations with ID-based
- [ ] Add operation IDs and timestamps (HLC)
- [ ] Implement fractional positioning
- [ ] Test commutativity with randomized operation order
- [ ] Design conflict UI (if using Option 3)

## References

- [Automerge Documentation](https://automerge.org/)
- [Yjs CRDT](https://docs.yjs.dev/)
- [Figma's Fractional Indexing](https://www.figma.com/blog/realtime-editing-of-ordered-sequences/)
- [Hybrid Logical Clocks Paper](https://jaredforsyth.com/posts/hybrid-logical-clocks/)

## Related Code

- Current implementation: `src/doc/index.ts`
- Grid realization logic: `src/doc/index.ts:87-125`