#odin_lang I made a rope a few months ago, I had written down a presentation of it, but did not record until today. The video shows what it is and how it works, with a quick overview of my code.
About those asserts: I would also invest some time to develop a fuzz tester. Fuzzer is a program that generates bunch of random inputs (bytes, really) and feeds them into your program. Based on their value you then add, remove or replace a tree node. Then, you check for crashes or asserts being triggered, memory leaks, uninitialized variables, buffer overflows, ..., you name it. You could also do those tree operations simultaneously with another known good tree implementation and compare both trees.
I'm guilty of not writing helpful utility functions upfront as well (in your case, the tree printing function). It's tempting to work in the dark but it pays off to have your utilities and tests for really difficult problems like you were tackling here.
I thought I was the only one guilty of this. I have found that outputting to graphviz format, can really help, as it draws the tree for you, from your textual representation.
I wonder if you can improve performance by replacing the tree of independently allocated objects and pointers with an array of nodes. Much like how you can encode a binary tree in an array. The stringy leaves would still be heap allocated.
Ropes are used by javascript v8 for each string. The problem with text buffers or just linear strings in this case is that they would trash memory usage if you use large strings even more. Javascript programming is dirty. Also in other languages you can build immutable string structure that still share a part of the data and just add spans to the code. The only reuseable C++ rope implementation i saw was in the early 2000s. And this RogueWave library did not survive the tides of time. I wrote a good code text editor once and i have to say the most important thing for me was to have a simple one string per line array. And take the not to underestimate task of making cross line text operations useable on them. All different kind of meta data is line based and best to stored with them. Things like attributed strings do not work at all when the text becomes a little bit more complex. And if it is not a code editor (say markdown), then just break it into an array of paragraph strings.
And implementing a good rope datastructure for an editor. I assume more that it takes at least 500 hours (3 month) for a good programmer to implement. Remember undo, cloning (for lock free data sharing), searching, regexpr searching. Good luck finding a library where you can feed the rope string spans and don't need to serialize into a normal string first. ....
Indeed, this video was largely intended to be a summary of what it is, and a bit of a ward against diving into it as the main data structure unless it really has a raison de etre.
The tool is called spall. Also written in Odin. There is a free version but also a paid version on itch.io. the free version does 4gb traces but is wasm only.
Your camera view seems to lack contrast. are you sure you're putting some color grading on your raw footage? oh it's probably a webcam, well either way. just something i noticed, it looks like when people put ungraded raw footage in a video.
is this really better? with the number of memory allocated nodes, you'll be getting a cache miss with every access. also with modern CPUs, you can get SIMD mem-moves on the cache which are really fast. heck, even a vector of vectors (line buffers) would likely be faster and easier to implement.
There are use-cases for it, but as stated in the video, i think a gap buffer is way better in most scenarios. I was told this but was hard headed and decided to try it for myself anyway. There were not (m)any good resources to understand ropes in detail, hopefully my video filled the gap (pun intended..? ha ha).
@@mjolnirdev sure, data structures have their use cases, especially if you have a large number of terms that need to be sorted and in the case of rope buffers if you want to do a lot of string sorcery. i'm just trying to justify their use as edit buffers. if memory serves me, i think the helix editor uses them, but rust code corrodes my brain so i haven't looked into it.
About those asserts: I would also invest some time to develop a fuzz tester. Fuzzer is a program that generates bunch of random inputs (bytes, really) and feeds them into your program. Based on their value you then add, remove or replace a tree node. Then, you check for crashes or asserts being triggered, memory leaks, uninitialized variables, buffer overflows, ..., you name it. You could also do those tree operations simultaneously with another known good tree implementation and compare both trees.
Thanks algorithm
Good presentation, watched the whole thing even though IDGAF about making a text editor or probably ever using this :P
I'm guilty of not writing helpful utility functions upfront as well (in your case, the tree printing function). It's tempting to work in the dark but it pays off to have your utilities and tests for really difficult problems like you were tackling here.
I thought I was the only one guilty of this.
I have found that outputting to graphviz format, can really help, as it draws the tree for you, from your textual representation.
I wonder if you can improve performance by replacing the tree of independently allocated objects and pointers with an array of nodes. Much like how you can encode a binary tree in an array. The stringy leaves would still be heap allocated.
Ropes are used by javascript v8 for each string. The problem with text buffers or just linear strings in this case is that they would trash memory usage if you use large strings even more. Javascript programming is dirty. Also in other languages you can build immutable string structure that still share a part of the data and just add spans to the code. The only reuseable C++ rope implementation i saw was in the early 2000s. And this RogueWave library did not survive the tides of time.
I wrote a good code text editor once and i have to say the most important thing for me was to have a simple one string per line array. And take the not to underestimate task of making cross line text operations useable on them. All different kind of meta data is line based and best to stored with them. Things like attributed strings do not work at all when the text becomes a little bit more complex. And if it is not a code editor (say markdown), then just break it into an array of paragraph strings.
And implementing a good rope datastructure for an editor. I assume more that it takes at least 500 hours (3 month) for a good programmer to implement. Remember undo, cloning (for lock free data sharing), searching, regexpr searching. Good luck finding a library where you can feed the rope string spans and don't need to serialize into a normal string first. ....
Indeed, this video was largely intended to be a summary of what it is, and a bit of a ward against diving into it as the main data structure unless it really has a raison de etre.
V8 uses many different string implementations depending on the context
12:45 how do you get this nice profiling visual graph?
The tool is called spall. Also written in Odin. There is a free version but also a paid version on itch.io. the free version does 4gb traces but is wasm only.
Look Gordon, ropes!
Your camera view seems to lack contrast. are you sure you're putting some color grading on your raw footage? oh it's probably a webcam, well either way. just something i noticed, it looks like when people put ungraded raw footage in a video.
it is true - i originally left that way on purpose but folks commented about it multiple times so have since gone back to a graded color
is this really better? with the number of memory allocated nodes, you'll be getting a cache miss with every access.
also with modern CPUs, you can get SIMD mem-moves on the cache which are really fast. heck, even a vector of vectors (line buffers) would likely be faster and easier to implement.
There are use-cases for it, but as stated in the video, i think a gap buffer is way better in most scenarios. I was told this but was hard headed and decided to try it for myself anyway. There were not (m)any good resources to understand ropes in detail, hopefully my video filled the gap (pun intended..? ha ha).
@@mjolnirdev sure, data structures have their use cases, especially if you have a large number of terms that need to be sorted and in the case of rope buffers if you want to do a lot of string sorcery. i'm just trying to justify their use as edit buffers.
if memory serves me, i think the helix editor uses them, but rust code corrodes my brain so i haven't looked into it.
@@androth1502zed also uses them
Do you have a github?
I do. If you look in the bottom left of the titleslide there is a link to the repo.
my brain