"New algorithms for collaborative text editing" by Martin Kleppmann (Strange Loop 2023)

  • Published: 25 Nov 2024

Comments • 6

  • @k98killer 1 year ago +2

    Interesting. Storing the formatting in a companion data structure is a very elegant solution. I was thinking of doing Markdown; glad someone else did that experiment already. I'll definitely have to rethink how I generate IDs for elements in RGArrays and Causal Trees since UUIDs are too noisy to be easily compressed and create a ton of metadata overhead.
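One common alternative to UUIDs, sketched here as an illustration and not as what the talk or any specific library does, is to identify each element by a (counter, replica) pair. The names `OpId`, `IdGenerator`, and `delta_encode` are hypothetical, and the replica is assumed to be a small integer that indexes a separately stored table of replica names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True, order=True)
class OpId:
    """Hypothetical compact element ID: ordered by counter, then replica."""
    counter: int  # Lamport-style logical counter
    replica: int  # assumed: small index into a table of replica names


class IdGenerator:
    """Hands out increasing IDs for one replica."""

    def __init__(self, replica: int) -> None:
        self.replica = replica
        self.counter = 0

    def next_id(self) -> OpId:
        self.counter += 1
        return OpId(self.counter, self.replica)

    def observe(self, other: OpId) -> None:
        # Keep the local counter ahead of every ID seen so far.
        self.counter = max(self.counter, other.counter)


def delta_encode(ids: List[OpId]) -> List[int]:
    """Store counter differences for a run of IDs from one replica."""
    deltas = []
    previous = 0
    for op_id in ids:
        deltas.append(op_id.counter - previous)
        previous = op_id.counter
    return deltas
```

Consecutive typing from one replica produces counters that increase by one, so the deltas form a long run of identical small integers, which run-length or general-purpose compression handles far better than random UUID bytes.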

  • @supermajic 1 year ago +3

    Interesting. If I wanted a centralised sync server that had no knowledge of the contents, I suppose I'd have to forgo compaction, or compact only batches from individual clients, sacrificing some real-time behaviour. I wonder whether that's feasible, or whether server-side storage would blow out in typical applications.

    • @ibgib 1 year ago +4

      Around 7:00, he mentions that you can think of their drafts as similar to branches in git. Just like in git, you should be able to squash and rewrite history as needed, and it doesn't have to be batched "now or never". This stems from it running off the deltas and applying those diffs, in the manner of event-sourcing-like CRDTs. Of course, your requirements may determine whether you need to keep track of those composite diffs: how closely, and at what level of granularity, do you need to stay to the history for auditing or security purposes?
      Ultimately, you can think of today's text files (those that do not persist undo histories) as fully compacted. Interestingly, if you maintain all edits in history (like keeping around old branches in git), you can start thinking of the end product like this as actually just a view/projection of the same data, similar to a release commit tag in git, and you can compose from multiple branches as you see fit.
      Note that I'm not an Upwelling/InkAndSwitch person, so this is all IMO, but I've been working on related tech for many years now. The speaker, MK, has several other very interesting videos, including ones on his previous append-only logging experiences and one specifically on Automerge.
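The "text file as a projection of its edit history" idea above can be sketched in a few lines. This is a deliberately simplified, index-based edit log and not a CRDT; the edit tuple format and the names `project` and `squash` are assumptions made for illustration only.

```python
from typing import List, Tuple, Union

# Hypothetical edit format: ("insert", index, text) or ("delete", index, length)
Edit = Tuple[str, int, Union[str, int]]


def project(edits: List[Edit]) -> str:
    """Replay the edit log in order to get the current text (the 'view')."""
    text = ""
    for kind, index, arg in edits:
        if kind == "insert":
            text = text[:index] + arg + text[index:]
        elif kind == "delete":
            text = text[:index] + text[index + arg:]
        else:
            raise ValueError(f"unknown edit kind: {kind}")
    return text


def squash(edits: List[Edit]) -> List[Edit]:
    """Compact a history into one insert that yields the same text."""
    text = project(edits)
    return [("insert", 0, text)] if text else []


history = [
    ("insert", 0, "helo"),
    ("insert", 3, "l"),
    ("delete", 0, 1),
    ("insert", 0, "H"),
]
```

Here `project(history)` and `project(squash(history))` give the same text, but the squashed log no longer records how the text came to be, which is the auditing trade-off mentioned above.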

    • @k98killer 1 year ago

      @ibgib I saw a bunch of MK's videos explaining his work just before I wrote my own CRDT library in Python -- they were pretty inspirational, though I took my work in a different direction. Whereas MK's work is focused on optimizing a specific CRDT (a causal tree built on top of a replicated growable array), I implemented a dozen CRDTs, about half of which are compositions or views on top of more primitive CRDTs, for the sake of broader experimentation.

  • @davidglaubman6341 1 year ago

    This is great, Martin!
    The design choices and what they enabled somehow reminded me of event sourcing. Is there a relationship between CRDTs and event sourcing?

    • @k98killer 1 year ago +2

      CRDTs essentially use a stream of events to encode state. The main distinguishing feature of a CRDT is that you get the same state no matter the order in which you apply events or the number of times a given event is applied, so long as every event is applied at least once. At least, that's how current CRDTs function -- older algorithms were not as reliable under adverse conditions.
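The order- and duplicate-insensitivity described above can be illustrated with one of the simplest CRDTs, a last-writer-wins register. This is a minimal sketch and not the algorithm from the talk; the event layout `(timestamp, replica_id, value)` and the names `LWWRegister` and `replay` are assumptions.

```python
import random
from typing import List, Optional, Tuple

# Assumed event layout: (timestamp, replica_id, value)
Event = Tuple[int, str, str]


class LWWRegister:
    """Last-writer-wins register: keeps the event with the largest
    (timestamp, replica_id) pair, so applying events is commutative
    and idempotent."""

    def __init__(self) -> None:
        self.winner: Optional[Event] = None

    def apply(self, event: Event) -> None:
        if self.winner is None or event[:2] > self.winner[:2]:
            self.winner = event

    @property
    def value(self) -> Optional[str]:
        return None if self.winner is None else self.winner[2]


def replay(events: List[Event]) -> Optional[str]:
    """Apply events in the given order and return the resulting value."""
    register = LWWRegister()
    for event in events:
        register.apply(event)
    return register.value


events = [
    (1, "a", "draft"),
    (2, "b", "edit"),
    (2, "a", "other"),
    (3, "a", "final"),
]

# Deliver the events shuffled and with duplicates; the result is stable.
rng = random.Random(0)
delivery = events + rng.sample(events, 2)
rng.shuffle(delivery)
```

Replaying `delivery` gives the same value as replaying `events` in order, because the register only ever compares each incoming event against the current winner.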