Local‑First Apps in 2025: CRDTs, Replication Patterns, and Edge Storage for Real‑Time Offline Sync
Local‑first is no longer niche. In 2025, teams building collaborative products expect sub‑100 ms interactions, always‑on editing, and the ability to reopen a laptop on a plane and keep working. Achieving that is not a matter of adding yet another WebSocket server. It is about moving the system of record close to the user, designing for offline as the default mode, and proving that state converges under wildly asynchronous delivery.
This article offers a practical blueprint. It will help you decide between CRDTs and OT, design IDs and conflict rules, run sync at the edge using Durable Objects and Turso SQLite, secure multi‑tenant data, test merges, and steer around real‑world pitfalls. It is opinionated where reality demands it, and pragmatic where your roadmap does.
TL;DR
- Use CRDTs for multiwriter offline‑first systems; reserve OT for centralized, latency‑sensitive rich text where you still control a linearizing server.
- Assign stable actor IDs, use ULIDs or random 128‑bit IDs for entities, and encode clear conflict policies per field.
- Run a per‑document or per‑space sequencer at the edge (for example, Cloudflare Durable Objects) and persist ops in edge‑close SQLite (for example, Turso). Keep client state in SQLite in the browser via OPFS or IndexedDB‑backed engines.
- Encrypt per space; use capability tokens with caveats and row‑level filtering at the application layer. Do not rely on SQLite for RLS.
- Test merges with property‑based tests, op‑fuzzers, and differential replay. Maintain an ops ledger for reproducibility.
- Plan for snapshots, GC, index rebuilds, and document migration. Monitor CRDT size and run compression.
Why local‑first in 2025
Local‑first delivers three hard outcomes simultaneously:
- Perceived instant operations: reads and writes hit local storage and render immediately.
- Offline by default: a metro outage should not corrupt data or block productivity.
- Trustworthy convergence: after partitions heal, replicas reach the same state deterministically.
The tooling finally caught up. SQLite in the browser is mature via WebAssembly and the Origin Private File System (OPFS), mobile has reliable background sync, edge compute is ubiquitous, and practical CRDT libraries exist for lists, maps, counters, and registers. You can keep the fast path local while still projecting a global, consistent view.
Architecture: the shape of a local‑first system
A robust local‑first app is a set of cooperating replicas:
- Client replica
  - Durable local database (SQLite via OPFS, IndexedDB, or platform SQLite).
  - Conflict‑free data types for shared state.
  - Background transport that batches and synchronizes ops.
- Edge replica
  - Per‑document sequencer to impose causal order for side effects and snapshots.
  - Append‑only ops ledger in an edge database (Turso libSQL or SQLite shard per document or tenant).
  - Subscription fanout to clients, backpressure, and storage compaction.
- Optional regional core or long‑term storage
  - Durable archive of snapshots and compaction products.
  - Analytics, search indexes, backups, and key escrow.
This is a star topology with a spine of edge nodes. It is not pure peer‑to‑peer, because you want linearization for side effects, access control enforcement, and an audit trail. But it is not centralized either, because clients own their data and proceed without the edge.
CRDTs vs OT: choosing the right concurrency control
You do not need to join a decade‑long debate to make a good decision. Use this rule of thumb:
- Choose CRDTs when
  - Clients must make progress offline with no central arbiter.
  - Multiple concurrent writers are common.
  - You need convergence guarantees under any message ordering and duplication.
  - You can tolerate some metadata overhead per element.
- Choose OT when
  - You have a central, online server that can rebase and linearize operations in real time.
  - You are optimizing specifically for rich‑text keystroke performance with a small number of concurrent writers.
  - You accept more complex server logic and weaker offline guarantees.
CRDTs now come in flavors that are practical for product work:
- State‑based CRDTs (CvRDT) that merge via a join in a semilattice; easy but large payloads.
- Operation‑based CRDTs (CmRDT) with causal delivery; efficient but require tracking causality.
- Delta‑state CRDTs that ship compact deltas; a pragmatic middle ground.
For lists (documents, rich text), mature options include Yjs, Automerge 2, and Logoot‑style CRDTs. For maps, sets, counters, and registers, you can use built‑ins like OR‑Map, OR‑Set, PN‑Counter, and multi‑value register (MVReg). Many teams run a hybrid: CRDTs for document structure and metadata, OT for keystroke‑level text editing within a paragraph. That compromise can produce excellent UX while retaining offline correctness.
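As a quick illustration of convergence with one of these libraries, the snippet below creates two Yjs documents, makes concurrent edits, and exchanges updates in both directions; the replicas end up identical regardless of delivery order. It uses the public Yjs update API and is only a toy, not a sync protocol.

```ts
import * as Y from 'yjs'

// Two replicas editing concurrently while "offline" from each other.
const docA = new Y.Doc()
const docB = new Y.Doc()

docA.getArray('tasks').push(['write outline'])
docB.getArray('tasks').push(['review PR'])

// Exchange state updates in both directions; order does not matter.
const updateA = Y.encodeStateAsUpdate(docA)
const updateB = Y.encodeStateAsUpdate(docB)
Y.applyUpdate(docA, updateB)
Y.applyUpdate(docB, updateA)

// Both replicas now contain both tasks in the same order.
console.log(docA.getArray('tasks').toArray(), docB.getArray('tasks').toArray())
```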
Opinion: in 2025, default to CRDTs unless you have a hard requirement that only OT satisfies. You want offline and multiwriter correctness baked in.
Data modeling and IDs: make conflicts boring
Correct convergence starts with IDs and conflict semantics. Some rules that consistently save teams pain:
- Use 128‑bit random IDs or ULIDs for entities and edges. ULIDs sort roughly by creation time, which improves locality, but never rely on their embedded wall‑clock component for correctness.
- Assign stable actor IDs per device install. Back them with an asymmetric keypair (Ed25519) to sign ops and provide a durable identity. This also simplifies capability tokens.
- Use causal metadata per op: actor ID plus a monotonically increasing counter, or a version vector per document. Avoid relying on timestamps to break ties.
- Express field‑level conflict rules explicitly:
  - Registers: MVReg or last‑writer‑wins (LWW) with a tiebreaker on actor ID and counter.
  - Sets: grow‑only (G‑Set) when deletion is not needed, OR‑Set when it is.
  - Counters: PN‑Counter for increments and decrements.
  - Maps: OR‑Map to combine nested CRDTs.
  - Lists: use a list CRDT such as RGA, LSEQ, LogootSplit, or the list CRDTs built into Yjs and Automerge.
Design for idempotency from the start. Treat every op as safe to apply twice. That means the op must carry the identity of the causally unique event, for example actor ID plus counter.
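To make the register rule above concrete, here is a minimal sketch of an LWW register whose tiebreak never touches wall clocks: the higher (counter, actor ID) pair wins deterministically on every replica. The types are assumptions for illustration, not a library API.

```ts
// Sketch: an LWW register ordered by (counter, actorId), never by wall-clock time.
interface Tagged<T> {
  value: T
  counter: number  // per-actor monotonic counter from the op envelope
  actorId: string  // stable device identity; lexicographic comparison breaks ties
}

function mergeLww<T>(a: Tagged<T>, b: Tagged<T>): Tagged<T> {
  if (a.counter !== b.counter) return a.counter > b.counter ? a : b
  return a.actorId > b.actorId ? a : b
}

// Commutative and idempotent: mergeLww(a, b) picks the same winner as mergeLww(b, a),
// and merging a value with itself changes nothing.
```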
Example: operation envelope
Every op your client emits should be wrapped in a minimal envelope that carries convergence metadata. Keep it boring and explicit.
```json
{
  "docId": "space_01HX1K9Q5X8S7W4G7D3YH2J0VZ",
  "actorId": "actor_f2c0a9b4e7c1",
  "seq": 1023,
  "causalParents": ["actor_f2c0a9b4e7c1:1022", "actor_64bd...:88"],
  "type": "setField",
  "path": ["tasks", "task_01H...", "title"],
  "value": "Ship local‑first blog",
  "timestamp": 1735584000000,
  "signature": "ed25519:..."
}
```
Note: the timestamp is informational only and never used for correctness; it exists for UX and audit. The signature is optional but recommended.
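Where the envelope leaves op identity implicit, one common approach is to derive the op ID by hashing the actor, sequence number, and payload, which also enables the content‑addressed deduplication mentioned under replication below. A minimal sketch using the Web Crypto API; the canonical encoding and the `op_` prefix are assumptions, not part of the envelope spec above.

```ts
// Sketch: derive a content-addressed op ID from actor, seq, and payload.
// Assumes a JSON payload; a production system would use a canonical binary
// encoding (for example CBOR) so hashes are stable across runtimes.
async function computeOpId(actorId: string, seq: number, payload: unknown): Promise<string> {
  const bytes = new TextEncoder().encode(`${actorId}:${seq}:${JSON.stringify(payload)}`)
  const digest = await crypto.subtle.digest('SHA-256', bytes)
  const hex = [...new Uint8Array(digest)].map(b => b.toString(16).padStart(2, '0')).join('')
  return `op_${hex}`
}
```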
Replication and sync patterns
The network will deliver messages late, early, duplicated, and out of order. Good replication protocols make that boring.
- Transport‑agnostic ops
  - Encode ops in CBOR or MessagePack with actor, seq, parents, and payload.
  - Use content‑addressing (a hash of the op) as an optional layer for deduplication.
- Push and pull
  - Push new ops to the edge over WebSocket or WebTransport; receive acks by watermark.
  - Periodically pull missing ops based on version vectors or Bloom filters over op IDs (see the version vector sketch after this list).
- Idempotency and reordering
  - The server applies ops out of order into a CRDT, not as transactional row mutations. Store them in an append‑only ledger and materialize views from it.
- Compression and snapshots
  - Periodically compute a snapshot state and reset causal metadata to reduce memory. Keep a stable snapshot interval per document (for example every 10k ops or 10 MB).
  - Compact historical ops by folding sequences that commute (for example OR‑Set adds and removes that cancel out).
- Backpressure
  - If clients fall behind beyond a threshold, fall back to a snapshot download instead of replaying a million ops.
- Partial replication
  - Chunk large documents or spaces by subtree path, shard, or tile so clients subscribe only to what they view.
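To make the pull path concrete, here is a minimal sketch of a version‑vector exchange: the requester sends its per‑actor watermarks and the peer returns ops beyond them. The `VersionVector` shape and the `opsSince` helper are assumptions for illustration, not an API from any of the libraries discussed here.

```ts
// Sketch: per-actor high-water marks and the ops a peer is missing.
type VersionVector = Record<string, number> // actorId -> highest contiguous seq seen

interface Op { actorId: string; seq: number; payload: unknown }

// Ops the requester has not seen yet, given its version vector.
function opsSince(ledger: Op[], remote: VersionVector): Op[] {
  return ledger.filter(op => op.seq > (remote[op.actorId] ?? 0))
}

// Merge two vectors, keeping the highest seq per actor (used after applying a batch).
function mergeVectors(a: VersionVector, b: VersionVector): VersionVector {
  const out: VersionVector = { ...a }
  for (const [actor, seq] of Object.entries(b)) out[actor] = Math.max(out[actor] ?? 0, seq)
  return out
}
```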
Running sync at the edge
You want an edge process that understands your doc model, enforces access, sequences side effects, and persists ops with low latency for nearby users. Two battle‑tested components in 2025 are Cloudflare Durable Objects and Turso (libSQL‑backed SQLite).
Durable Objects as a convergence coordinator
Durable Objects (DO) provide single‑threaded, per‑key state with strong ordering. Map one DO to one collaborative space or document. It becomes the place where:
- You validate and accept ops.
- You impose a causally consistent order for side effects and snapshot production.
- You fan out deltas to currently connected clients.
The DO does not own the canonical state; your system is still multi‑master because clients can proceed offline. The DO simply sequences and persists ops near the users and helps avoid duplication or head‑of‑line blocking in the rest of your infrastructure.
Example sketch of a Durable Object that persists to Turso and fans out ops:
```ts
export class DocObject {
  state: DurableObjectState
  storage: DurableObjectStorage
  subscribers: Map<string, WebSocket>
  turso: any

  constructor(state: DurableObjectState, env: any) {
    this.state = state
    this.storage = state.storage
    this.subscribers = new Map()
    this.turso = createTursoClient(env.TURSO_URL, env.TURSO_TOKEN)
  }

  async fetch(req: Request) {
    const url = new URL(req.url)
    if (url.pathname === '/ws' && req.headers.get('Upgrade') === 'websocket') {
      const [client, server] = Object.values(new WebSocketPair()) as [WebSocket, WebSocket]
      await this.handleSocket(server)
      return new Response(null, { status: 101, webSocket: client })
    }
    if (req.method === 'POST' && url.pathname === '/op') {
      const op = await req.json()
      await this.applyOp(op)
      return new Response('ok')
    }
    return new Response('not found', { status: 404 })
  }

  async handleSocket(ws: WebSocket) {
    const id = crypto.randomUUID()
    this.subscribers.set(id, ws)
    ws.accept()
    ws.addEventListener('message', async evt => {
      const msg = JSON.parse(evt.data as string)
      if (msg.type === 'op') await this.applyOp(msg.op)
      if (msg.type === 'pull') await this.sendSince(msg.after)
    })
    ws.addEventListener('close', () => this.subscribers.delete(id))
  }

  async applyOp(op: any) {
    // 1) authz, signature verify, capability check
    // 2) idempotency: ignore if already stored
    await this.turso.execute({
      sql: 'insert or ignore into ops (doc_id, op_id, actor, seq, parents, payload, ts) values (?, ?, ?, ?, ?, ?, ?)',
      args: [op.docId, op.opId, op.actorId, op.seq, JSON.stringify(op.causalParents), JSON.stringify(op.payload), op.timestamp]
    })
    // 3) broadcast to subscribers
    const msg = JSON.stringify({ type: 'op', op })
    for (const ws of this.subscribers.values()) ws.send(msg)
  }

  async sendSince(watermark: string) {
    const rows = await this.turso.execute({
      sql: 'select payload from ops where doc_id = ? and op_id > ? order by rowid asc limit 1000',
      args: [/* doc id */, watermark]
    })
    // send batched
    // ...
  }
}
```
Notes:
- DO provides per‑document ordering for side effects, not correctness of the CRDT. The CRDT converges even without the DO.
- Use limits and backpressure. If a client asks for too much, send a snapshot instead.
- The schema and SQL are illustrative; adjust to your op envelope.
Turso SQLite for edge‑close ops storage
SQLite is an excellent ops ledger, and libSQL via Turso gives you global edge placement with a simple client, low write latency, and read replicas. Schemas that work well for CRDT ops have three layers:
- ops: append‑only, idempotent on op_id, partitioned by doc_id
- snapshots: periodic compressed state with the causal watermark
- indexes or materialized views: derived application views for queries
Example minimal schema:
```sql
create table if not exists ops (
  doc_id text not null,
  op_id text primary key,
  actor text not null,
  seq integer not null,
  parents text not null,
  payload blob not null,
  ts integer not null
);

create index if not exists idx_ops_doc on ops(doc_id, ts);

create table if not exists snapshots (
  doc_id text primary key,
  watermark text not null,
  snapshot blob not null,
  ts integer not null
);
```
For multi‑tenant isolation, consider one database per tenant for strong blast‑radius control or at least hard partition keys in your tables with strict prefix queries. SQLite lacks built‑in row‑level security, so you must enforce tenant boundaries in your SQL access layer and through capability tokens.
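Since SQLite gives you no RLS to lean on, the access layer has to make it structurally hard to forget the tenant filter. A minimal sketch of one way to do that; the `TenantDb` wrapper and its query check are assumptions of this sketch, not part of the Turso client API.

```ts
// Sketch: a thin query wrapper that refuses to run statements without a tenant scope.
// Assumes a libSQL-style client exposing execute({ sql, args }).
interface SqlClient { execute(q: { sql: string; args: unknown[] }): Promise<{ rows: unknown[] }> }

class TenantDb {
  constructor(private client: SqlClient, private tenantId: string) {}

  // Every statement must name the tenant column; the tenant ID is always bound first.
  async query(sql: string, args: unknown[] = []) {
    if (!/\btenant_id\s*=\s*\?/i.test(sql)) {
      throw new Error('query rejected: missing tenant_id filter')
    }
    return this.client.execute({ sql, args: [this.tenantId, ...args] })
  }
}

// Usage: the caller can no longer forget the boundary.
// const db = new TenantDb(turso, 'tenant_acme')
// await db.query('select * from ops where tenant_id = ? and doc_id = ?', [docId])
```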
Client storage: SQLite in the browser
In 2025 the stack looks like this on the client:
- Persistent store: SQLite compiled to WebAssembly backed by OPFS, or a robust IndexedDB engine.
- CRDT engine: Yjs or Automerge for lists and maps; simple counters and sets can be hand‑rolled.
- Transport: WebSocket or WebTransport for live sync; HTTP for initial snapshot fetches.
Persisting in SQLite locally allows you to:
- Write synchronously on the UI thread only when safe; otherwise defer to a worker.
- Keep an ops table for offline operations.
- Maintain materialized views to power your UI without replay.
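A plausible client‑side schema under those constraints, kept deliberately small; the table and column names are illustrative, not a required layout:

```sql
-- Sketch of a local (browser or mobile) SQLite layout.
-- docs holds the latest CRDT snapshot, pending_ops is the offline outbox,
-- and task_view is a rebuildable materialized view that powers list screens.
create table if not exists docs (
  doc_id    text primary key,
  snapshot  blob not null,
  watermark text
);

create table if not exists pending_ops (
  op blob not null,
  ts integer not null
);

create table if not exists task_view (
  doc_id  text not null,
  task_id text not null,
  title   text,
  done    integer not null default 0,
  primary key (doc_id, task_id)
);
```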
A reference blueprint: putting it together
Let us sketch a complete flow for a team notes app with tasks and rich comments.
- Data model
  - Each space is a CRDT map keyed by IDs. Sections are list CRDTs. Tasks are OR‑Maps with fields that are MVReg or LWW where appropriate.
  - Attachments are content‑addressed blobs stored separately (R2, S3), referenced by hash in the doc. Blob metadata sits in the CRDT.
- IDs
  - Space IDs are ULIDs. Entity IDs are random 128‑bit values. Actors are Ed25519 keys derived from device enrollment.
  - The op ID is hash(actorId, seq, payload), which deduplicates retries.
- Conflict rules
  - The title field is MVReg; when the UI must pick one value, tiebreak deterministically by (seq, actor ID), comparing actor IDs lexicographically.
  - Task status is LWW with semantic guards; if both sides flip from open to done concurrently, prefer done.
  - The members set is an OR‑Set.
- Sync
  - The client batches ops up to 32 KB or 100 ms, whichever comes first.
  - The edge DO validates the capability and merges the op into the ops table, then broadcasts to all subscribers.
  - Every N ops, the DO writes a snapshot. Clients behind the watermark fetch the new snapshot.
- Security
  - The capability token is a macaroon‑like token with caveats: tenant, doc, role, expiry, and a signature bound to the actor key.
  - Data is encrypted per space using XChaCha20‑Poly1305. The space key is encrypted to each member's X25519 public key. Attachments use the same envelope pattern.
- Testing
  - Keep an ops ledger per space in Turso and also mirror logs in object storage.
  - Nightly property‑based tests attempt thousands of random merges and reorders using recorded op distributions.
Code snippets: client CRDT and edge sync
A tiny sketch using Automerge for a tasks list with offline ops queue, storing locally in SQLite. This is intentionally minimal to show structure.
```ts
import * as Automerge from '@automerge/automerge'
import { open } from 'sqlite-wasm' // illustrative SQLite WASM wrapper

interface Task { id: string; title: string; done: boolean }

class LocalReplica {
  db: any
  doc: Automerge.Doc<any>
  opsQueue: any[]
  actorId: string
  seq = 0

  constructor(actorId: string) {
    this.actorId = actorId
    this.opsQueue = []
  }

  async init() {
    this.db = await open({ filename: 'app.db', create: true })
    await this.db.exec('create table if not exists pending_ops (op blob, ts integer)')
    const row = await this.db.get('select snapshot from docs where id = ?', ['space_1'])
    this.doc = row ? Automerge.load(row.snapshot) : Automerge.from({ tasks: [] as Task[] })
  }

  addTask(title: string) {
    // Automerge.change returns the new document; the binary change is read separately.
    this.doc = Automerge.change(this.doc, d => {
      d.tasks.push({ id: crypto.randomUUID(), title, done: false })
    })
    const change = Automerge.getLastLocalChange(this.doc)
    const op = { actorId: this.actorId, seq: this.nextSeq(), payload: change }
    this.enqueueOp(op)
  }

  applyRemote(change: Uint8Array) {
    const [next] = Automerge.applyChanges(this.doc, [change])
    this.doc = next
  }

  enqueueOp(op: any) {
    this.opsQueue.push(op)
    this.db.run('insert into pending_ops (op, ts) values (?, ?)', [JSON.stringify(op), Date.now()])
  }

  nextSeq() {
    // Monotonic per actor; persist this alongside the snapshot in production.
    return ++this.seq
  }
}
```
On the edge, the DO accepts Automerge changes (binary) and stores them. The server does not interpret the document structure; it treats changes as opaque CRDT deltas. You can later compute snapshots by applying all changes in order.
Security and multi‑tenant data
Security is not just authentication. It is also preventing side channels, stopping cross‑tenant data leaks, and revoking access in a distributed setting.
- Isolation model
  - Hard option: per‑tenant database instances. Costlier, but clean boundaries and backups. Turso supports many small databases; this is a strong default for enterprise.
  - Soft option: a shared database with tenant prefix keys and strict guards in code. Pair with continuous verification.
- Capability tokens
  - Favor capability tokens over broad bearer tokens. A capability binds a subject (actor key) to a resource (doc ID) with caveats (role, expiry, path constraints). Macaroons are one approach. Sign capabilities with a service key and validate at the edge. Optionally bind to a client public key so it cannot be replayed from another device.
- End‑to‑end encryption
  - Encrypt document payloads and ops. Maintain a per‑space symmetric key. When inviting a new member, wrap the key to their X25519 public key. For revocation, rotate the space key and rewrap to remaining members. Old content remains accessible unless you also re‑encrypt snapshots and blobs; plan your threat model accordingly. (A sketch of this envelope pattern follows this list.)
- Row‑level security
  - SQLite does not offer RLS. Implement filters in your application layer and avoid ad‑hoc SQL. Consider generating SQL from a restricted DSL that bakes in tenant filters.
- Audit and forward‑secure logs
  - Sign ops with actor keys. Compute a per‑space hash chain of ops to detect tampering. Store the head in an append‑only log in object storage to get external anchoring.
- Rate limits and fairness
  - Rate‑limit per actor and per doc to prevent flooding the ledger with tiny ops. Aggregate keystrokes into bounded deltas.
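A minimal sketch of the per‑space envelope described above, using libsodium (the `libsodium-wrappers` package); key storage, rotation, and the exact envelope fields are assumptions left to your design.

```ts
import _sodium from 'libsodium-wrappers'

// Sketch: encrypt an op payload with a per-space symmetric key, and wrap that
// key to a member's X25519 public key so they can join the space.
async function encryptForSpace(memberPublicKey: Uint8Array) {
  await _sodium.ready
  const sodium = _sodium

  // Per-space symmetric key (XChaCha20-Poly1305).
  const spaceKey = sodium.randombytes_buf(sodium.crypto_aead_xchacha20poly1305_ietf_KEYBYTES)

  // Encrypt one op payload; the nonce travels alongside the ciphertext.
  const payload = new TextEncoder().encode('{"type":"setField","value":"Ship it"}')
  const nonce = sodium.randombytes_buf(sodium.crypto_aead_xchacha20poly1305_ietf_NPUBBYTES)
  const ciphertext = sodium.crypto_aead_xchacha20poly1305_ietf_encrypt(
    payload, null, null, nonce, spaceKey
  )

  // Wrap the space key to a member's X25519 public key (sealed box).
  const wrappedKey = sodium.crypto_box_seal(spaceKey, memberPublicKey)

  return { nonce, ciphertext, wrappedKey }
}
```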
Testing merges and correctness
Local‑first correctness is a property of your merge function under adversarial reorderings, not a unit test that checks a single sequence.
- Property‑based testing
  - Use libraries like fast‑check (TypeScript), Hypothesis (Python), or QuickCheck (Haskell) to generate random operation sequences, reorder them, and check invariants such as idempotency, commutativity where expected, and convergence.
- Differential testing
  - Run the same operation sequences through multiple engines, for example Yjs vs your own model for a subset of operations, and compare final states.
- Replay production traces
  - Persist ops in a ledger; nightly, replay them with randomized delivery, drops, and duplication. Confirm the snapshot hash remains invariant.
- Fuzz causality
  - Randomize causal parents, simulate partial causal histories, and confirm the engine buffers until parents are present or otherwise handles missing causality safely.
- Invariants to encode
  - No duplicate IDs in sets or lists after merge.
  - Merged lists contain exactly the inserted elements, with none lost or duplicated.
  - Application invariants such as unique task titles must be enforced as soft constraints with repair strategies; hard constraints are not compatible with arbitrary merges.
Example property‑based test using fast‑check for a simplified OR‑Set:
```ts
import fc from 'fast-check'

class ORSet<T> {
  adds = new Map<string, T>()
  removes = new Set<string>()

  add(id: string, v: T) { this.adds.set(id, v) }
  remove(id: string) { this.removes.add(id) }

  value(): T[] {
    return [...this.adds].filter(([id]) => !this.removes.has(id)).map(([, v]) => v)
  }

  merge(other: ORSet<T>): ORSet<T> {
    const out = new ORSet<T>()
    out.adds = new Map([...this.adds, ...other.adds])
    out.removes = new Set([...this.removes, ...other.removes])
    return out
  }
}

fc.assert(
  fc.property(
    fc.array(fc.oneof(
      fc.record({ t: fc.constant('add'), id: fc.uuid(), v: fc.string() }),
      fc.record({ t: fc.constant('rem'), id: fc.uuid() })
    )),
    ops => {
      const a = new ORSet<string>()
      const b = new ORSet<string>()
      for (const op of ops) { if (op.t === 'add') a.add(op.id, op.v); else a.remove(op.id) }
      for (const op of ops.slice().reverse()) { if (op.t === 'add') b.add(op.id, op.v); else b.remove(op.id) }
      const v1 = a.merge(b).value().sort()
      const v2 = b.merge(a).value().sort()
      return JSON.stringify(v1) === JSON.stringify(v2)
    }
  )
)
```
This test checks commutativity of merge under reversed operation order. Expand with idempotency and associativity.
Performance engineering: GC, snapshots, and indexes
CRDTs can grow due to tombstones and metadata. Plan for lifecycle management from day one:
- Snapshot strategy
  - Keep an append‑only ops log per document, but produce compact snapshots frequently. A rolling window of the last N ops keeps catch‑up cheap.
- GC and vacuum
  - Periodically fold tombstones and compact. Yjs and Automerge provide GC options; tune them. In SQLite, run vacuum during low‑traffic windows.
- Op compression
  - Combine adjacent changes that commute, like multiple increments of the same counter, before shipping them across the network.
- Index rebuilds
  - Materialized views should be rebuildable from snapshots and recent ops. Treat them as caches, not sources of truth.
- Attachment handling
  - Never embed large blobs in your CRDT. Store them in blob storage with content hashes and range requests. Make upload and download resumable. Keep small thumbnails locally in SQLite for instant UX.
- Batch size and Nagle effects
  - Avoid per‑keystroke network chatter. Batch within 50 to 200 ms windows for perceived‑instant updates without overwhelming the edge. Tune by measuring median and tail latencies (see the batching sketch below).
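A minimal sketch of that time‑and‑size bounded batching; the thresholds and the `send` callback are assumptions to illustrate the shape, not recommended constants.

```ts
// Sketch: flush queued ops when either the byte budget or the time window is hit.
class OpBatcher {
  private buffer: Uint8Array[] = []
  private bytes = 0
  private timer: ReturnType<typeof setTimeout> | null = null

  constructor(
    private send: (batch: Uint8Array[]) => void,
    private maxBytes = 32 * 1024, // size bound
    private maxDelayMs = 100      // latency bound
  ) {}

  push(op: Uint8Array) {
    this.buffer.push(op)
    this.bytes += op.byteLength
    if (this.bytes >= this.maxBytes) return this.flush()
    if (!this.timer) this.timer = setTimeout(() => this.flush(), this.maxDelayMs)
  }

  flush() {
    if (this.timer) { clearTimeout(this.timer); this.timer = null }
    if (this.buffer.length === 0) return
    this.send(this.buffer)
    this.buffer = []
    this.bytes = 0
  }
}
```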
Pitfalls and how to avoid them
- LWW everywhere
  - LWW is seductive but wrong for many fields. It discards information and, if misused, depends on clocks. Use MVReg or domain‑aware merges.
- Clock skew dependence
  - Do not depend on wall clocks for correctness. Use per‑actor counters, version vectors, or Lamport timestamps for ordering; treat wall‑clock timestamps as UX hints or tiebreakers, never as a source of causality.
- Over‑centralization at the edge
  - Durable Objects are a convenience; do not let them become a crutch. Clients must converge without reaching the edge. If your app stalls offline, you have built a thin client.
- Unbounded document growth
  - CRDTs with list positions can grow metadata. Choose list CRDTs that balance locality and ID size, run GC, and snapshot aggressively.
- Access revocation semantics
  - Revocation does not erase previously synced data. To limit access going forward, rotate keys and deny new ops. For stronger guarantees, encrypt per document and re‑encrypt snapshots upon revocation, but communicate the cost to product stakeholders.
- Multi‑tab and multi‑device actor identity
  - Each device should have a stable actor ID. If you produce ops from multiple tabs, either multiplex through a shared worker or assign per‑tab actors with a logical parent device ID.
- Schema migrations
  - CRDT evolution needs forward‑compatible migrations. Version your document schema and support a bounded set of transforms. Run migration on snapshot load, not on every op (a sketch follows this list).
- Testing only the happy path
  - Inject packet duplication, reordering, and drops in CI. Simulate delayed parents. Replay real ops to find corner cases early.
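To make the migration advice concrete, here is a minimal sketch of versioned snapshot migration: a bounded set of transforms is applied once, when a snapshot is loaded. The `schemaVersion` field and the transform registry are assumptions of this sketch, not part of any CRDT library's API.

```ts
// Sketch: migrate a loaded snapshot forward through a bounded set of transforms.
interface Snapshot { schemaVersion: number; data: any }

// One transform per version bump; each must be pure and forward-compatible.
const migrations: Record<number, (data: any) => any> = {
  1: data => ({ ...data, tasks: data.tasks ?? [] }),                                   // v0 -> v1: introduce tasks
  2: data => ({ ...data, tasks: data.tasks.map((t: any) => ({ estimate: 0, ...t })) }) // v1 -> v2: add estimate
}

const CURRENT_VERSION = 2

function migrateSnapshot(snap: Snapshot): Snapshot {
  let { schemaVersion, data } = snap
  while (schemaVersion < CURRENT_VERSION) {
    const step = migrations[schemaVersion + 1]
    if (!step) throw new Error(`no migration from v${schemaVersion}`)
    data = step(data)
    schemaVersion += 1
  }
  return { schemaVersion, data }
}
```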
Choosing your stack in 2025
A compact, productive stack for a new local‑first app might be:
- Client
  - SQLite WASM with OPFS for persistence
  - Yjs or Automerge for CRDTs
  - WebSocket for live sync; the Background Sync API on mobile
  - Web Crypto for key management; IndexedDB for key storage
- Edge
  - Cloudflare Durable Objects as the per‑doc coordinator
  - Turso SQLite for ops and snapshots
  - R2 or S3‑compatible blob storage for attachments
- Core
  - Nightly export of ops and snapshots to long‑term storage
  - Analytics on derived, sanitized views; keep raw ops private and encrypted
Alternatives also worth evaluating:
- ElectricSQL if you want Postgres as the backbone with client replication
- CR‑SQLite for integrating CRDT semantics directly into SQLite
- Replicache for local‑first with server diffing in a traditional client‑server model
- RxDB or PouchDB if you prefer IndexedDB‑centric stores with sync adapters
Pick one and move. The bigger risks are in unclear conflict semantics and incomplete testing, not in the precise CRDT library.
Operational concerns
- Observability
  - Emit metrics on op ingest rate, backlog per doc, snapshot size, GC time, and client catch‑up latencies. Alert on sustained backlogs.
- SLOs
  - Define SLOs such as p95 time to first paint after opening a heavy document, p95 offline operation latency, and p99 time to consistency after reconnect.
- Backups and disaster recovery
  - Take periodic SQLite backups per tenant and test restore. Snapshots are your fast restore path; ops logs provide audit and correctness.
- Data retention
  - Decide how long to keep full ops. Some teams keep a year of ops and only snapshots beyond that. Make it a policy and implement it (see the SQL sketch below).
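A sketch of what that retention policy can look like in the ledger, assuming the `ops` and `snapshots` schema shown earlier; the one‑year cutoff is illustrative, not a recommendation.

```sql
-- Sketch: drop ops older than the retention window, but only where a newer
-- snapshot already covers them, so any replica can still catch up from a snapshot.
delete from ops
where ts < strftime('%s', 'now', '-365 days') * 1000
  and exists (
    select 1 from snapshots s
    where s.doc_id = ops.doc_id
      and s.ts >= ops.ts
  );

-- Reclaim space afterwards, during a low-traffic window.
vacuum;
```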
A worked merge policy example
Suppose a task object has fields: title, description, status, assignees, estimate. A reasonable merge policy:
- title: MVReg with deterministic tiebreak; render UX with a gentle conflict indicator and allow users to reconcile manually if needed.
- description: rich‑text CRDT (Yjs); text merges automatically at character granularity.
- status: domain‑aware LWW where done overrides open; once done, only explicit reopen from a newer op flips it back.
- assignees: OR‑Set of user IDs.
- estimate: PN‑Counter to support both increments and decrements across devices.
Document this in your schema and enforce it in your engine or adapter layer. Avoid ad‑hoc if statements scattered throughout your code.
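One way to centralize that policy is a declarative per‑field merge table that the sync engine consults instead of branching inline. A minimal sketch; the field names mirror the example above, and the `MergeFn` shape is an assumption of this sketch rather than an API of Yjs or Automerge.

```ts
// Sketch: one place that says how each register-like task field merges.
interface OpMeta { localSeq: number; remoteSeq: number; localActor: string; remoteActor: string }
type MergeFn<T> = (local: T, remote: T, meta: OpMeta) => T

// Deterministic register tiebreak: higher seq wins, then lexicographically larger actor ID.
function lww<T>(local: T, remote: T, m: OpMeta): T {
  if (m.remoteSeq !== m.localSeq) return m.remoteSeq > m.localSeq ? remote : local
  return m.remoteActor > m.localActor ? remote : local
}

const taskMergePolicy: Record<string, MergeFn<any>> = {
  // Domain-aware status merge: done overrides open; explicit-reopen handling omitted for brevity.
  status: (local, remote, m) => (local === 'done' || remote === 'done') ? 'done' : lww(local, remote, m),
  // title would normally be an MVReg that surfaces both values; lww here is only the UI-facing pick.
  title: lww,
  // assignees and estimate are handled by OR-Set and PN-Counter CRDTs, not register merges.
}
```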
Example: from browser SQLite to edge ledger
A concrete sketch of a client sync loop that persists locally in SQLite and then ships ops in batches to a DO, with exponential backoff and watermark tracking.
```ts
class SyncEngine {
  db: any
  ws?: WebSocket
  connected = false
  watermark: string | null = null
  backoffMs = 1000

  constructor(db: any) {
    this.db = db
  }

  async start() {
    await this.connect()
    setInterval(() => this.flush(), 100)
  }

  async connect() {
    const url = 'wss://edge.example.com/do/space_1/ws'
    this.ws = new WebSocket(url)
    this.ws.onopen = () => {
      this.connected = true
      this.backoffMs = 1000 // reset backoff after a successful connect
      this.requestCatchup()
    }
    this.ws.onclose = () => {
      this.connected = false
      setTimeout(() => this.connect(), this.backoffMs)
      this.backoffMs = Math.min(this.backoffMs * 2, 30_000) // exponential backoff, capped
    }
    this.ws.onmessage = evt => this.handleMessage(JSON.parse(evt.data as string))
  }

  requestCatchup() {
    if (!this.ws) return
    this.ws.send(JSON.stringify({ type: 'pull', after: this.watermark }))
  }

  async flush() {
    if (!this.connected || !this.ws) return
    const rows = await this.db.all('select rowid, op from pending_ops order by rowid asc limit 100')
    if (rows.length === 0) return
    const batch = rows.map((r: any) => r.op)
    this.ws.send(JSON.stringify({ type: 'ops', ops: batch }))
    // rely on ack to delete; keep ops idempotent until confirmed
  }

  async handleMessage(msg: any) {
    if (msg.type === 'ack') {
      // delete everything up to the acked op
      await this.db.run('delete from pending_ops where rowid <= ?', [msg.rowid])
      this.watermark = msg.opId
    }
    if (msg.type === 'op') {
      // apply the remote change to the CRDT and persist a snapshot periodically
    }
  }
}
```
This is not production‑ready but shows the pieces: local queue, batch send, ack, and catch‑up. Add backpressure and snapshot fallbacks.
Governance and docs
Write down the following and keep them versioned alongside code:
- CRDT inventory with merge laws per field type
- Op envelope spec and evolution policy
- Snapshot and GC schedule
- Security model: capabilities, keys, encryption, revocation
- Testing plan: properties, fuzzers, replay, and SLOs
When you onboard new engineers, give them a doc that shows how an op travels from a keystroke to the ledger to another device. If they cannot explain it back in 10 minutes, the system might be too clever.
Checklist
- Data types
  - CRDTs chosen per field with documented merge laws
  - Entity IDs and actor IDs designed and stable
- Storage
  - Client has durable local DB
  - Edge ledger is append‑only with snapshots and compaction
- Sync
  - Ops are idempotent, causally annotated, and batched
  - Backpressure and snapshot fallback implemented
- Security
  - Capability tokens enforced at edge
  - Per‑space encryption with key rotation plan
  - Tenant isolation strategy documented
- Testing
  - Property‑based, differential, and replay tests in CI
  - Production ops ledger retained for debugging
- Operations
  - Metrics on backlog, snapshot sizes, GC time, catch‑up latencies
  - Backups and restore rehearsals
Conclusion
Local‑first in 2025 is a pragmatic engineering choice that yields a better user experience and a more resilient system. The discipline is to make offline the default and convergence inevitable. CRDTs provide the algebra; edge storage and compute provide the latency benefits; careful IDs and conflict rules provide predictability. Layer in capabilities and encryption for a multi‑tenant world, and invest early in testing merges rather than unit testing endpoints.
You can start small: adopt a CRDT library for collaborative lists, store your operations in an edge SQLite, and gate everything through a per‑document Durable Object. Watch your metrics, snapshot aggressively, and iron out merge policies. The result will ship faster than you think and delight users even when the network does not cooperate.
Further reading and tools
- Automerge: https://automerge.org/
- Yjs: https://yjs.dev/
- Cloudflare Durable Objects: https://developers.cloudflare.com/durable-objects/
- Turso (libSQL): https://turso.tech/
- CR‑SQLite: https://vlcn.io/
- ElectricSQL: https://electric-sql.com/
- Replicache: https://replicache.dev/
- Macaroons: https://research.google/pubs/pub41892/
- fast‑check (property‑based testing for TypeScript): https://github.com/dubzzz/fast-check