Pawel elegantly distills the complex mathematics of CRDTs into an intuitive framework for understanding modern collaborative software. It is a masterclass in technical communication that makes high-level distributed systems concepts accessible without sacrificing depth.
How Collaborative Text Editors Don't Break
You've probably been in a Google Doc with four other people typing at once.
Your cursor doesn't jump. Their edits land where they're supposed to. Nothing breaks and nothing freezes. In general, the experience is very smooth. To the computer, it is a terrifying distributed systems nightmare. How do you synchronize state across multiple machines with unpredictable network latency without locking the file? Today, we are looking at the mathematical elegance of CRDTs, conflict-free replicated data types, and how they solve the collaborative editing problem.
To understand why this is so difficult, we have to look at how text editing actually works under the hood. Imagine a shared document that contains the three letters c a t. Two users, Alice and Bob, are editing this document at the exact same time.
Both of them see the exact same word, cat.
Alice wants to change those letters to chat, so her client generates an operation, insert h at index one. Bob wants to change those letters to cats, so his client generates the operation, insert s at index three. Both operations are calculated against the same snapshot they see right now, the word cat. Now the server has to apply them, and here is the problem. Even if it applies them one at a time in a perfectly clean order, the document still breaks.
Alice's operation runs first. The server inserts h at index one. The word is now chat.
But because of that insertion, every letter after it just shifted one position to the right. The letter t is no longer at index two. It is at index three.

Then Bob's operation runs. Bob asked, insert s at index three. He picked that number while looking at the original cat, where index three meant after the t. But the server is not looking at cat anymore. It is looking at chat. So it inserts s in front of whatever sits at index three right now, which is t. The result is c h a s t. Alice wanted chat. Bob wanted cats. A correct merge would have been chats. They got chast.

The operations were correct. The order was correct. The problem is that index three in cat and index three in chat mean two different places. Indexes are relative to a snapshot, and the snapshot changed underneath Bob's hands. This is called the index shift problem.

And it gets worse with deletes. If Alice deletes the a at index one, the string becomes ct. If Bob's delayed operation to insert s at index three arrives now, index three doesn't even exist anymore. The program crashes or the letter ends up in completely the wrong place. The fundamental issue is the same. Indexes are not stable references.

Historically, systems solved this using a central authority. Early collaborative tools used pessimistic locking. If Alice is typing, the document is locked and Bob's keyboard is disabled. This is terrible UX. Later, Google Docs popularized a technique called operational transformation, or OT. OT works by having the server intercept every incoming operation, calculate how the indexes have shifted based on other users' actions, transform the operation to match, and broadcast it back out. OT works, but it is complex to implement and typically relies on a central server to act as the source of truth. If you go offline, you can't easily merge your changes later.

CRDTs approach the problem from a completely different angle. Instead of relying on a central server to transform operations, CRDTs redesign the underlying data structure so that concurrent edits always merge the same way and conflicts cannot arise to begin with.
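The index shift problem, and the kind of adjustment an OT server makes, can be reproduced in a few lines. This is a minimal sketch: `apply_insert` and `transform_insert` are hypothetical helper names, only inserts are handled, and a real OT system also has to deal with deletes and with ties at the same index.

```python
def apply_insert(text: str, index: int, char: str) -> str:
    """Insert char before position index, as a naive server would."""
    if not 0 <= index <= len(text):
        raise IndexError(f"index {index} is outside {text!r}")
    return text[:index] + char + text[index:]


def transform_insert(index: int, earlier_index: int) -> int:
    """Shift an insert index past an earlier insert that landed to its left.

    Equal indexes would need a tie-break rule, which this sketch leaves out.
    """
    return index + 1 if earlier_index < index else index


snapshot = "cat"
alice = (1, "h")  # insert h at index one
bob = (3, "s")    # insert s at index three, chosen against "cat"

# Naive server: apply both operations in order with no adjustment.
after_alice = apply_insert(snapshot, *alice)
naive = apply_insert(after_alice, *bob)

# OT-style server: rewrite Bob's index to account for Alice's insert first.
bob_index = transform_insert(bob[0], alice[0])
transformed = apply_insert(after_alice, bob_index, bob[1])

print(naive)        # chast
print(transformed)  # chats

# The delete case: once "a" is removed, index three no longer exists.
after_delete = snapshot[:1] + snapshot[2:]  # "ct"
try:
    apply_insert(after_delete, 3, "s")
except IndexError as err:
    print("rejected:", err)
```

The transform step only works because one place sees both operations and knows which one ran first, which is why this approach leans on a central server.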
To achieve this, CRDTs rely on three mathematical properties: commutativity, associativity, and idempotence. In simple terms, no matter what order operations arrive in, and no matter how many times they are applied, the final state will always be exactly the same.

How does this work in practice? Real CRDTs like Yjs, the library behind many local-first editors and collaboration tools, give every character a unique ID, not a fractional number. The ID is made of two simple things: which client typed it, and a counter that ticks up with every operation from that client. Let's go back to our three letters: C, A, T, all typed by Alice. C gets the ID Alice 1, A gets Alice 2, T gets Alice 3. The ID never changes. It is a permanent name for that character.

But IDs alone don't tell you what comes after what. So each character also remembers two things: who was on its left when it was typed, and who was on its right. We call these origin and origin right. C was typed first, into the empty document, so its left is nothing and its right is nothing. A was typed between C and the end. So A's left is C, A's right is nothing. T was typed after A. Its left is A, its right is nothing. The document is now a tiny linked list. C points to A, A points to T.

Now, Alice wants to insert an H between C and A. Her client doesn't say index 1. It records H with origin C and origin right A. That is a coordinate the network can't break. C and A are permanent names. Even if other characters arrive between them later, the description, after C and before A, still makes sense. Alice broadcasts a new character H with ID Alice 4, origin C, and origin right A.

Now, imagine Bob, on a different machine at the exact same moment, decides to insert H between C and A as well. Bob broadcasts H with ID Bob 1, origin C, and origin right A. When the messages reach each other's machines, both clients see two new characters trying to live in the same gap. Same neighbors, same window. This is where the client ID comes in. The rule is dead simple.
When two characters share the same origin window, sort them by client ID. Alphabetically, Alice comes before Bob. So, on every machine, the order is the same. C, then Alice's H, then Bob's H, then A, and then T. That is not because a server picked it. It is because both machines applied the same sorting rule to the same data.
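The ID, origin, and tie-break rules above can be sketched as a small replica class. This is a simplified illustration, not Yjs's actual integration algorithm: `Item`, `Replica`, and `integrate` are hypothetical names, client names are lowercase strings so that tuple comparison gives the alphabetical order, and the sketch assumes a character's origins have already arrived before the character itself.

```python
from dataclasses import dataclass
from typing import Optional

Id = tuple[str, int]  # (client name, per-client counter)


@dataclass(frozen=True)
class Item:
    id: Id
    char: str
    origin: Optional[Id]        # left neighbour when typed (None = start)
    origin_right: Optional[Id]  # right neighbour when typed (None = end)


class Replica:
    def __init__(self) -> None:
        self.items: list[Item] = []

    def _index(self, item_id: Optional[Id], default: int) -> int:
        if item_id is None:
            return default
        return next(i for i, it in enumerate(self.items) if it.id == item_id)

    def integrate(self, item: Item) -> None:
        if any(it.id == item.id for it in self.items):
            return  # already applied, so a duplicate delivery changes nothing
        left = self._index(item.origin, -1)
        right = self._index(item.origin_right, len(self.items))
        pos = left + 1
        while pos < right:
            other = self.items[pos]
            other_left = self._index(other.origin, -1)
            if other_left < left:
                break  # other was anchored further left, so stay before it
            if other_left == left and other.id > item.id:
                break  # same window: the smaller client ID goes first
            pos += 1
        self.items.insert(pos, item)

    def text(self) -> str:
        return "".join(it.char for it in self.items)


base = [
    Item(("alice", 1), "C", None, None),
    Item(("alice", 2), "A", ("alice", 1), None),
    Item(("alice", 3), "T", ("alice", 2), None),
]
alice_h = Item(("alice", 4), "H", ("alice", 1), ("alice", 2))
bob_h = Item(("bob", 1), "H", ("alice", 1), ("alice", 2))

first, second = Replica(), Replica()
for op in base + [alice_h, bob_h]:  # Alice's H arrives first
    first.integrate(op)
for op in base + [bob_h, alice_h]:  # Bob's H arrives first
    second.integrate(op)

print([it.id for it in first.items] == [it.id for it in second.items])  # True
print(first.text())  # CHHAT
```

Both replicas end with Alice's H ahead of Bob's H, even though they received the two inserts in opposite orders.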
Because every character carries a permanent ID and remembers its origin and origin right, network latency stops mattering. It doesn't matter if Alice's H operation arrives 3 days late. When it lands, the receiving client knows exactly what to do. Find the gap between C and A and insert there.
If Bob's H is already there, the same client ID rule decides the order. Every time, on every machine, no matter when the messages arrive.
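The claim that arrival order and repeated delivery don't matter is the three properties named earlier: commutativity, associativity, and idempotence. One simple way to see them, assuming a replica's knowledge is modeled as a set of uniquely identified operations merged by set union, is the check below. The `Op` shape and `merge` function are hypothetical, and this models only which operations a replica knows about, not where each character ends up.

```python
Op = tuple[str, int, str]  # hypothetical shape: (client, counter, char)


def merge(a: frozenset[Op], b: frozenset[Op]) -> frozenset[Op]:
    """Merging two replicas is a plain set union of the operations they know."""
    return a | b


base = frozenset({("alice", 1, "C"), ("alice", 2, "A"), ("alice", 3, "T")})
alice = base | {("alice", 4, "H")}
bob = base | {("bob", 1, "H")}
late = frozenset({("alice", 4, "H")})  # the same op, redelivered days later

# Order of merging does not matter.
commutative = merge(alice, bob) == merge(bob, alice)
# Grouping of merges does not matter.
associative = merge(merge(alice, bob), late) == merge(alice, merge(bob, late))
# Receiving an already-known op again changes nothing.
idempotent = merge(merge(alice, bob), late) == merge(alice, bob)

print(commutative, associative, idempotent)  # True True True
```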
But what about deletions? If we delete a character and free up its slot, a delayed insert from another user might still be looking for it as its origin or origin right. The reference would point to nothing. CRDTs solve this using tombstones. When you press backspace, the character is not actually removed from the data structure. The CRDT simply sets a flag, isDeleted, to true on that character. The UI sees the flag and hides the character from the screen, but the data structure keeps it. So if another user's delayed insert references that deleted character, say origin A, then even though A is gone from the screen, A is still there in memory with its ID available as an anchor. The new character slots in next to the tombstone, and the document stays consistent.

Because CRDTs guarantee eventual consistency purely through math, they enable a paradigm called local-first software. A user can open an application on an airplane, work offline for 10 hours, make thousands of edits, and the moment they reconnect to Wi-Fi, their CRDT merges automatically with the rest of their team's changes. There are no merge conflicts. There is no blocked UI.
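The tombstone mechanism can be sketched in a few lines. This is a minimal illustration under assumptions: `Char`, `delete`, `insert_after`, and `visible` are hypothetical names, and the sketch ignores the client ID tie-break so that it can focus on deletion.

```python
from dataclasses import dataclass


@dataclass
class Char:
    id: tuple[str, int]  # (client name, per-client counter)
    value: str
    is_deleted: bool = False  # tombstone flag


def find(doc: list[Char], char_id: tuple[str, int]) -> int:
    return next(i for i, ch in enumerate(doc) if ch.id == char_id)


def delete(doc: list[Char], char_id: tuple[str, int]) -> None:
    """Mark the character as deleted but keep it in the structure."""
    doc[find(doc, char_id)].is_deleted = True


def insert_after(doc: list[Char], origin_id: tuple[str, int], new: Char) -> None:
    """Place a new character right after its origin, tombstone or not."""
    doc.insert(find(doc, origin_id) + 1, new)


def visible(doc: list[Char]) -> str:
    """What the UI shows: every character that is not a tombstone."""
    return "".join(ch.value for ch in doc if not ch.is_deleted)


doc = [Char(("alice", 1), "C"), Char(("alice", 2), "A"), Char(("alice", 3), "T")]

delete(doc, ("alice", 2))
print(visible(doc))  # CT

# A delayed insert from Bob still names the deleted A as its origin.
insert_after(doc, ("alice", 2), Char(("bob", 1), "U"))
print(visible(doc))  # CUT
print(len(doc))      # 4, because the tombstone is still stored
```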
Under the hood, modern CRDT libraries heavily compress these unique IDs and tombstones so that documents don't consume unbounded memory, organizing them into highly optimized data structures like doubly linked lists and B-trees.
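One common compression idea is to store a run of characters typed in sequence by the same client as a single entry, keeping only the first ID and the text. The sketch below illustrates that idea under assumptions: `compress` and the tuple layout are hypothetical, and this is not any particular library's storage format.

```python
Entry = tuple[str, int, str]  # (client, first counter, text)


def compress(items: list[Entry]) -> list[Entry]:
    """Merge neighbours from one client whose counters are consecutive."""
    runs: list[Entry] = []
    for client, counter, text in items:
        if runs:
            run_client, run_start, run_text = runs[-1]
            # The next counter expected in this run is start + length so far.
            if run_client == client and run_start + len(run_text) == counter:
                runs[-1] = (run_client, run_start, run_text + text)
                continue
        runs.append((client, counter, text))
    return runs


typed_in_order = [("alice", 1, "C"), ("alice", 2, "A"), ("alice", 3, "T")]
print(compress(typed_in_order))  # [('alice', 1, 'CAT')]

# After the concurrent inserts, only A and T are still adjacent and consecutive.
merged_doc = [
    ("alice", 1, "C"),
    ("alice", 4, "H"),
    ("bob", 1, "H"),
    ("alice", 2, "A"),
    ("alice", 3, "T"),
]
print(len(compress(merged_doc)))  # 4
```

Each character's ID can still be recovered from a run as the first counter plus the character's offset, so the anchors that origin and origin right depend on are not lost.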
By changing the data type itself, we eliminate the need for a central coordinator. We move from a fragile system of shifting indexes to an immutable, mathematically sound timeline of events. That is how CRDTs solve the collaborative editing problem.