I think this whole playlist should be an ideal programming course for any x language. But most of the course end up just reading through the docs and taking us through the syntax of a language. Learning a new language is not about learning a syntax and get done with it, its exactly this, these videos which gives the insight of how to desing your programs in rust. These mental models are what that any new comer can learn and can apply to Rust and any other language. I am really really thankful to you for doing these sets of videos.
Dang, I'm just learning rust coming from a higher level language background and your videos are amazing. They get me thinking in a more nuanced way about these things.
Very nice video! I just want to add something about why rust vec don't use null pointer at 6:17, it's not so about niche optimisations but it's because references can't be null in rust even when pointing to ZST, it must be a well aligned non null pointer, and Vec deref as [T], so even if the vec is empty it must provide this, and if it store a null pointer the deref impl would have an if clause to create dangling but well aligned pointer, and you wan't unecessary branches in a function called in pretty much all vec operations, so it's more about performances than memory usage, pretty much all heap allocated data structure do the same thing for ZST, don't allocate for 0 bytes, but still create a valid reference.
All of this is true! But all of the rules you’re talking about (the strict requirements on the values of references etc.) were created for a reason, which is to allow the compiler to do things like layout optimizations (among other types of optimizations) when it sees references. So I think we are ultimately making the same point, when you consider _why_ Rust doesn’t allow non-null references. Your comment has some great added detail about what Vec has to care about internally; thanks for pointing out the Deref thing, that’s a super interesting point! Edit: eh, I guess there’s that whole “memory safety” thing too or whatever that means references need to be non-null and properly aligned. ;) I’m so used to it I take it for granted.
@@_noisecode sorry if it wasn’t clear, I did not say it was not for niche optimisation, but this optimisation is more an added bonus of the original advantage of a dangling pointer. If I recall correctly don’t C++ also require references to be non null? I’m pretty sure null ref is UB in cpp too
Yeah I think I got what you're saying! I'm saying that it seems to me that the _reason_ that Rust requires e.g. an empty &[T] to still be non-null is for layout optimizations. Rust could've made the design choice that an empty slice was allowed to be null (meaning Vec could store a nullable pointer), since there's nothing to read from anyway; but instead it requires that it dangle so that all zeros remains a special value in every case. (I'm sure there are other reasons besides layout optimizations that this is a good idea, but layout optimizations is a big one.) Your comment is making me think about this more than I ever really have before so I really appreciate the discussion. :) And yeah, null references in C++ are UB. Although IMHO Rust references aren't a direct analog to C++ references; they're somewhere between C++ references, C++ pointers, and a secret third thing (borrow checking).
I love the way this video is structured! It's so well done that at ~12:00 I stopped the video, looked at the code and thought about it myself. And I arrived at the same conclusion you later gave. So basically what a great teacher does: you taught me the basics and with those I was able to figure out the problem myself! :D
This rules. Literally hit this issue in a Rust Advent of Code problem where I thought using `reserve_exact` would speed things up - but instead ground to a halt. Cheers!
Nope, that's a constructor with a specified size, it's default constructing your type (here int, a default constructed int is zero) to make a std::vector of seven zeroes Rust's Vec::with_capacity(N) gets us an empty Vec, but with guaranteed capacity for N items, very different.
In C#, List.EnsureCapacity(int) doubles until the new capacity is equal to or greater than the specified value. It does let you pre-allocate at initialization to a specific value, though (or set capacity explicitly)
Good pointers, reserve() in a loop would definitely almost always be an anti-pattern. The concept that all elements have to be moved during a new alloc (which is most likely handled by a realloc in the array case) is sometimes/often not true, especially if the layout of every item is a multiple of 8 (or word size) and there's still space in the chunk without requiring a sbrk.
Also, past a certain allocation size, any memory allocator worth its salt should be putting your allocation in a separate virtual memory page rather than together in the heap where it puts tiny dynamic allocations. And, once you're at the point of dealing with full 4K pages, said allocator can "move" things by just asking the kernel to change a page's virtual memory address in the page table, no need for copying.
@@AJMansfield1 yeah well that always depends on current heap size (if u cant expand the text/data segment further). I think for malloc on most modern systems the default mmap value is 128kb, anything under that would be in the arena(s). Typically for contiguous memory you wouldnt want a big mmaped alloc since the ideal scenario is l1 caching so ud prolly want ur array to be smaller than 32kib or whatever the data cache size is on ur cpu. (Obviously different scenarios have different needs)
@@TehGettinq if you're doing a contiguous allocation that's larger than a cache line, there's little cache advantage to putting it in a shared heap arena - and a cache line is much smaller than a 4K page. The main performance reason not to use separate pages is because of the need for multiple kernel context switches. Not just the syscall asking for a page, but also for the page fault accessing that page for the first time - since the kernel usually only just makes a note of the fact you asked for one in the mmap call, and holds off on actually mapping in a page of physical memory until you try to write to the page it pinky swore it gave you.
Noo! I thought you'd be a channel with like ten hours of bingeable Rust content, but you are still new and building up your library content. I love these videos, your production quality is great and your explanations are clear :D
Great video! Once again, we turn down to "be aware of what you are using and how it works". It's interesting to me as I have heard the term optimization been described as "knowing your environment and doing the best job you can to optimize for it" and this is the explanation I choose to use. It's funny that a lot of times we try to make optimizations and we don't know our environment very well.
This video was fantastic. I actually laughed when you were noting how confusing it could be and whatnot at 2:23. No worries, I'm here for it. Thanks for linking to other interesting resources, too.
This was a great video. Not for the specific issue, but for how it showcases the way in which complex systems can fail despite everyone's benevolent intentions.
I really liked the video. The animations are great and the explanation is clear and easy to follow. One piece of advice I can give is 'don't say what you're gonna show, just show it'. Don't take this as hate, but you shouldn't say that you will show something later in the video. There are situations where you might want to say that, but in most cases, it's better to shape your script so it flows smoothly and shows topics as you talk about them.
Thank you so much for your videos. I absolutely love the structure of your script, how you first motivate the problem, explain everything on the way, but not in obnoxious detail, and then arrive at your conclusion. Most educational video creator on the internet could learn a great deal from you. Keep it up!
I mean language being "junior-friendly" is not really super beneficial. If that's the point, why bother using lowlevel languages in the first place? You can get pretty much same performance with languages that are easier to write and maintain in, and resort to {insert_low_level_language} bindings whenever you actually need it.
@@noobgam6331 once you work in a collaborative environment & have to work with tons of legacy, having fewer footguns is always a win regardless of your experience. If you've worked with good corporate C++, often the self-imposed constraints look like just rust without official spec/compiler support.
@@cortexauth4094 And the standard probably shouldnʼt (I think; maybe it actually should), but the implementations can all choose to be better. And the main ones are all open-sounce now.
I feel proud of myself, as soon as I clicked I connected the dots and guesses the exact topic. Not a problem I've thought of previously (if I bulk append, I use proper methods, if I push an unknown number of elements, I don't manually reserve), so this was a great heads-up, and the real-world example and precise complexity calculations were great.
I wrote the top-rated (and accepted) answer to the StackOverflow question "How does the capacity of std::vector grow automatically? What is the rate?", and this is an excellent breakdown of what's going on here. And I wish I had put your caveat about `reserve` in my answer. I'd link to it, but RUclips thinks all comments with links are spam. :-/
I read your SO answer multiple times while working on this video and it was very helpful in making sure I got it right. :) Here's the link: stackoverflow.com/a/5232342 (RUclips gives me a pass posting links in comments on my own videos)
This was really, really good. Well presented and articulated. Also an interesting topic, I had no idea about vector optimizations, I just assumed each push was a new allocation, so I absolutely would have made that same mistake
praise the algo for suggestibg rhis one. nitpick: if you replace obe of the early “ill explain it later” with the actual gist like “it boils down to this being linear and not geometric” it would still make me watch the whole vid while knowing youre not draging me along for retention. interestibg topic, thanks for sharing!
C# also have 2 functions for reversing its List type -Setting Capacity will force reallocation of the array to an exact size -Calling EnsureCapacity() will grow the array exponentially till it reach the needed size
3:50 for setting up a TArray to only allocate on the stack, and not touch the heap, you can use the second template argument to specify a TInlineAllocator, with a set number of items. Any number of items added beyond that however do end up on the heap.
This is an awesome tip for everyday Unreal code! SBO vectors are super useful and I sorta wish I'd mentioned them in the video. Though unfortunately this trick only helps to optimize parameter passing if the API is generic over the TArray's allocator, which USplineComponent::AddPoints for example isn't. So the provided TArray has to use the default heap allocator.
@@_noisecode Since it's a const ref, would it not simply use whatever the original TArray allocator has set up for it? As far as I understand at least (and I could be wrong here), once you've added the items with an inline allocator, the TArray simply is keeping track of the pointers to the stack locations of the items. Then the loop inside USplineComponent::AddPoints would just have reference to those locations from the stack.
The allocator is still part of the TArray's type, though. `TArray` is a different type than `TArray` (which, due to a defaulted template argument, is really `TArray`). So passing `TArray` to a parameter of type `const TArray&` won't compile.
If you know you‘ll be inserting multiple times, it makes sense to check if the class you‘re inserting to (like the one with the AddPoints method) exposes a reserve function itself (or a non-const ref to the underlying vector, etc), then you can work around the AddPoints function‘s reserve by simply doing the geometric reserve yourself before you call that function, since reserve will just do nothing if the container is already big enough (at least in C++).
I love these types of videos. Even though I mostly use Javascript and Java day to day it’s still so important to know what’s actually happening under all the abstractions. Good stuff 😊
Excellent video. Though I don't think the conclusion w.r.t. API design is solid. The implementation of AddPoints is pretty much exactly as I was expecting. A higher cost up front, for an exact sized data structure at the end. I'd expect linear time when called (thus exponential when called repeatedly), but minimal wasted memory. Whilst I can see your point about preserving the exponential growth, I think that is pessimization of the most common use case, construction with bulk points. That said, as is often the case, Rust does it right. A low-footguns default with 'reserve' and a method that gives you precise control with 'reserve_exact'. The AddPoints API just calling AddPoint in a loop is quite pointless, there should be a reserve (both exact and with exponential growth) API and then have the user write the loop.
6:08 While it's common for a null pointer to have a bit representation of all zero, that's not what C or C++ specified last time I checked. They do specify that casting an integer zero to a pointer will produce a null pointer, but not what e.g. "void *p; memset(&p, 0, sizeof p);" will leave p pointing at, nor u.p after "union {void *p; uintptr_t n;} u; u.n = 0;"
One thing I noticed looking at and starting to use a Dynamic Array in C is that appending a single element and a bulk of elements *is* the same operations. It's just if you're appending bulk of N or 1. With this knowledge, you can design around bulk only with iterators and everything, and then keep appending 1 as a special operation. This I think communicates well to users. Dynamic Arrays / Vectors are quite possibly one of the best general purpose data types ergonomically. So easy and simple too.
So the smart automatic allocation for growing a buffer with push() I believe is n log2 time complexity? Since it just doubles the size every time (just left shift the capacity integer once and ask for that much memory, feels nice).
One thing not touched on here is that preallocation can be useful when you need to allocate a large number of arrays and you know the size of them is smaller than the minimum default allocation. Obviously it's not so relevant to API implementors. But there are many tradeoffs esp if the array size changes later. In general it's best not to call methods like reserve() unless profiling shows that you benefit from the optimisation
In reply to 3:55... UE has TArrayView... but for some reason it isn't used very often in engine APIs. I have no clue why. I use it as a drop-in replacement for const TArray&, unless I am delegating to an API that requires a TArray.
Yeah it’s a great tool, but yeah, I don’t see it used much (kinda hence my wording about there not being a tool like this in UE’s “core design”). You also see ptr+length pairs used here and there but not too often. You could imagine maybe it’s tricky to support in BlueprintCallable APIs (and potentially opens up Blueprints to a new class of dangling-arrayview bugs?) and maybe it has just never gained much footing in the ecosystem for that reason, though it feels like it could still be used in engine-internal code more than it seems to be.
Sounds like a good rule of thumb would be this: could you replace each AddPoint with AddPoints on a 1-sized array without a performance drop (irrespective of the heap allocation)?
This is a good video that is very accessible to wide audience. On 16:34, though, I think it's far better to use `v.begin() + oldSize` or `v.end()` which are guaranteed to have constant time complexity instead of calling `std::next()` which is opaque and at least looks like it can walk through the iterator in O(n) time.
Wow this is something I have to think about on a nearly daily basis, and the significance of this case never occurred to me. Thanks for the awesome video!
At first I didn't see the "controversy" of the video, because it seemed obvious when understanding the geometric growth idea of vector. But then you pulled out the example of using a bulk append in a loop, and I felt so stupid. I can totally see myself having implemented the bulk_append in that exact way thinking I was clever and then later down the line using it in a loop to add multiple elements without realizing the problem.
I only clicked the video cuz of the cute crab. I was so disappointed the crab did not play a bigger part of the action in this video. Im going to rate this a 2 on IMDB. I want to see more cute crab action in the sequel. Can you makeba crab prequel?
Hm, why it wouldnt be better to resize only if the length is < than the value you want to resize it to and if grown factor was still small? That way would be optimised for both cases.
As an example of growth rate of a vector, Java will multiply by 1.5. An empty ArrayList (don't use Vector, it was a mistake, ArrayList is the one that's actually good) will by default have a capacity of 16. If you wanted a capacity of 7, you'd ask for new ArrayList(7); Oh yeah, and reserve is called ensureCapacity. Just as an example of what Logan is talking about at around 10:00. Different implementations of these ideas give different tradeoffs depending on what you want to accomplish.
It does seem like even just adding in a loop really would be ok, since as long as you’re preserving that exponential growth, it’s amortized constant right? reserving space up front can have benefits, ie you’d know ahead of time if you somehow ran out of memory i guess? It also might help reduce memory fragmentation if you’re doing other operations between those loops to avoid all the smaller growth stages of the array
@14:45 Alternatively, you could call reserve but do it with exponential growth (i.e., reserve the first power of two bigger than the amount you need). This'll work fine both in a loop and not, and it won't lead to any more wasted space than if you'd done nothing (since that's the amount of space just pushing would give you), so it's still a strict upgrade over doing nothing, even if there are some circumstances where it would be more wasteful than reserving the exact amount of space. I assume this is what rust's `reverse` method does, given that it exists distinct from `reserve_exact`? @15:50 Oops, you said exactly that lmao. Though I will note that `extend` relies on the iterator's size_hint method, and in some cases, the programmer may know better how big the iterator is than the iterator can. For example, I think `flat_map` is pretty bad at size_hints, because each individual iterator might be between 0 and infinite length and even if those iterators themselves have exact size hints, the flat_map size hint can't consume the flat map, which would be required to traverse it. But you might pretty easily know the exact size of your flat map iterator if the logic there is actually pretty simple. E.g., if you want to construct every pair of numbers between 0 and a given bound, you could have something like fn pairs(upper: u32) -> impl Iterator { (0..upper).flat_map(|x| (0..upper).map(|y| (x, y)) } then I'm pretty sure size_hint has no idea how big this is, so if you did fn pairs_vec(upper: u32) -> Vec { pairs(upper).collect() } you would probably benefit significantly from instead doing fn pairs_vec(upper: u32) -> Vec { let mut v = Vec::with_capacity(upper * upper); v.extend(pairs(upper)); v } (side note: I know it's kinda niche, but I wish there was a collect_with_capacity method for these cases)
Your `flat_map` example indeed has the size_hint `(0, None)`, so it would use normal vector growing operations. For this specific example, using `itertools` with `(0..8).cartesian_product(0..8)` would result in the correct size_hint `(64, Some(64)`, just for completeness sake. (I really like `itertools` personally and wouldn't want to live without it.) Interestingly enough, `(0..8).combinations_with_replacement()` using `itertools` does not have an accurate size_hint, also returning `(0, None)`. In nightly there at least is the `.collect_into` method for iterators, which in turn only calls extend, but makes it look a little cleaner (at least in my eyes). Unfortunately, using `.take(n)` only changes the upper bound of the size_hint, while `FromIterator` (after a few layers of abstractions) reserves capacity based only on the lower bound. The only "quick fix" I found was with the `.pad_using` method from `itertools` - this needs a closure however which computes the missing elements if necessary. While it is never called if you know the actual size, it is probably still compiled into the binary. There is also the `size_hint` crate, which is very small, but essentially only wraps an iterator with a given size hint (more or less what you want), which is returned as the lower bound when `.size_hint` is called. If such a dependency should be used is another question in and of itself, and I very much agree that a `.collect_with_capacity` method or some other workaround would be better, either in `std` or in the `itertools` crate, where it would make sense.
@@daskadse769 Yeah I wasn't super clear; flat_map does have a size hint because every iterator must, it's just the flat_map size hint contains zero information. It literally just says "the iterator has a length between 0 and infinity." And yeah, the specific example has cartesian product (though notably, I don't actually know if that's a zero cost abstraction in this case, could be interesting to benchmark) to do it with a size hint, but the point I'm making is that under some conditions, you do know much better than the iterator does how big it is. The "size hint" crate is neat though, something like that in itertools or std would be nice. Works better than `collect_with_capacity` because that doesn't generalize as well over different types of collections and would require additional implementations (though I suppose it could come with a default implementation that just invokes the current implementation and ignores the size hint).
what if the unreal function called reserve with double the size and only if the current capacity + the elements you want to add is greater than some threshold this would be fine to call in a loop
Since C++'s reserve can allocate more than the requested size wouldn't it be better to implement a version where the reserved size is always a power of 2? For example, if you call reserve(9) you get a capacity of 16. This way you get an exponential growth of the vector and avoid this problem. Of course, this means the vector won't be optimized memory-wise but this is one of the assumptions with vectors in C++, that the capacity is usually larger than the size. Maybe a new reserve method can be defined in the C++ standard with this exact behaviour.
@9:30 shouldn't that be O(log n) complexity, since log(n) of the pushes will trigger a memory reallocation? Calling it O(1)+ sounds like a marketing gimmick ;)
Cool video. I'm more of a C dev and don't always know C++ oddities, so it was weird learning that memory allocated with new or new[] can't be reallocated. It makes sense, but it's a shame that such a simple, massive optimisation for vectors isn't available.
even if you call realloc in c, it would most likely allocate a new memblock and copy the contents of the old block to the new one. It really depends on the malloc implementation tho, some allocators allocate in blocks of 32/64bytes and can be efficiently reallocated, some dont, some allocate in block only for small enough block sizes and use unpadded allocations for larger blocks. In short, it depends. So always assume that realloc shall make a new allocation. That's actually the reason, why it returns you a pointer, even if it might be the same pointer you just used: cuz it's most likely, that it would return a new pointer instead
This is why I don't believe the C developers that claim they're "seeing the real costs" and that C "isn't hiding anything" - your libraries are still hiding plenty. Your compiler optimizer is rewriting everything you write. Even writing assembly, the ops you have are all microcoded and pipelined and speculatively executed - there's no *real* bare metal anymore (maybe the FPGA people?), so you're just picking the language you like most, just like everyone else. Which is fine, just don't get smug about it!
@@SimonBuchanNzHi, FPGA Person here. Just wanted to point out that the synthesis tools do most of the heavy lifting, optimizations, dead code elimination, hardware mapping, timing analysis.... But I mean, you can kick down this can as far as you want. You will always get closer, c is closer than c++, assembly closer than c, and so on...
@@CheaterCodes I assume you *can* pick and place gates individually if you're an idiot, though? I've barely looked at hardware design stuff - Verilog and VHDL seem totally crazy to me - so I still think of you lot as "the real space wizards". Considering the non-stop pace of hardware security vulnerability reports, perhaps I should temper my idolatry 😄
What exactly do you mean with "can't be reallocated"? do you simply mean you can't call realloc on that pointer? why would you? What you would want is to instead call realloc on a pointer to the data which that object contains, and update any members which need to be changed to keep the object in a valid state. To make this less error prone you abstract that into functions. In the end it does the same you can do in c just in a safer package (e.g. when naturally using the object you won't be able to realloc the data and then forget to update its length variable). The problem in the video also still happens in c when you simply use realloc in a function you hand a T** or a pointer to a struct to and then call that function in a loop, or use realloc in a loop directly. Just C++ and rust do the speed up of amortized costs automatically when using something like std::vector while in C you'd need to use libraries that do it or write it yourself.
It's a good point, but why insert such small array into the spline? Can't you just run the loop and store the items in a new array before sending to spline, or just use the addpoint multiple times instead?
Vec::with_capacity creates a new Vec, whereas .reserve_exact operates on an existing Vec. Vec::with_capacity(n) is equivalent to: { let mut v = Vec::new(); v.reserve_exact(n); v } Edit: actually, Vec's docs are inconsistent about whether with_capacity gives you exactly the requested capacity or whether it might give you a little more. Best way to know for sure would be to dive in and read the implementation!
I just love when large C++ projects implements their own data structures of which there are ones provided by the language, making it impossible to get any benefits from optimizations. I understand it can be a compatibility thing, but Rust makes it much easier to avoid this problem; need an iterator that conforms to other iterators in the language? implement the traits. This makes it so easy to integrate different libraries together with the language without having to work with 500 different types and conversions in your main logic.
I do generally agree with you. Though, the thing that bugs me most about TArray isn't its mere existence--it's defensible that they would want to take full control over their vector implementation and tune it to their exact needs--it's the fact that the API is very different from std::vector. Methods are capitalized, things have different names (e.g. .resize() is called .SetNum()), iteration is kind of strange. It puts a needless barrier, both mentally and syntactically (which is annoying for e.g. templates), in the way of interop. Other libraries like Boost.Container that provide container alternatives do so while trying to mimic the standard API as closely as reasonable, which is a much better approach IMO. In Unreal's defense, though, as a codebase it literally predates the C++ standard. There was truly no standard library to use when the project was started. I recall a forum post where Tim Sweeney explains that the reason they rolled their own containers was largely historical and due to this fact. In that same post he mentioned that it might be a reasonable medium- to long-term goal to migrate UE to use the standard containers.
You know, all of this could be handled reasonably by putting the reserve inside an if statement, where it will only run reserve if it the target number is at least a constant factor (typically 2) larger than the current len. If it is lower than that, either you would already have enough space preallocated, or you would run a similarly expensive reallocation just once, as if you had run reserve. This approach mostly gives you the best of both worlds, without the risks talked about here, and it is super easy to implement.
Yeah probably an oversight since they imagined that people would be using the insert_multiple() type function to do a true bulk insert and not just inserting small arrays incrementally. So I really can't blame the developers just one of those weird use cases they probably never envisioned happening.
It might be worth a little bit of constant overhead to allocate *at least* a factor of the current vec capacity, but if the numbers of objects you're adding is more than that factor then allocate some extra in the same allocation. Also just exposing a reserve function for the collection in the API could be useful
I liked this video quite a bit, but wasn't a huge fan of interchangeably using 'geometric' and 'exponential' as they are different rates of growth. Specifically, what we're most familiar with is geometric growth, that being a^n. However, sometimes you'll see algorithms with truly exponential growth, which are n^a. They go back and forth, but for greater values of n, geometric complexity is proportionally greater than exponential complexity.
examples of how to get geometric growth back onto reserve: template void reserve_geometric(std::vector& vec, size_t n){ vec.reserve(std::max(n, vec.capacity() * 2)); } template void reserve_geometric(std::vector& vec_to, std::vector& vec_from){ vec_to.reserve(std::max(vec_to.size() + vec_from.size(), vec_to.capacity() * 2)); } I'm not saying it's better than the insert method shown in the video. I'm just giving example of how trivial it is to e.g. modify the reserve line in Unity AddPoints() to get this behaviour, if you want it. It's probably better than the method in the video which uses .resize() to grow geometrically because .resize() default constructs all the elements which might be expensive, depending on the type.
I love that you say BOTH C++ and Rust are your favorite programming languages.
2:30 This is such a good preemptive clarification.
Really shows you’re a good educator.
PLEASE keep making videos, I can't get enough of them. You're brilliant!
I think this whole playlist should be an ideal programming course for any x language. But most of the course end up just reading through the docs and taking us through the syntax of a language.
Learning a new language is not about learning a syntax and get done with it, its exactly this, these videos which gives the insight of how to desing your programs in rust. These mental models are what that any new comer can learn and can apply to Rust and any other language.
I am really really thankful to you for doing these sets of videos.
Dang, I'm just learning rust coming from a higher level language background and your videos are amazing. They get me thinking in a more nuanced way about these things.
Very nice video! I just want to add something about why rust vec don't use null pointer at 6:17, it's not so about niche optimisations but it's because references can't be null in rust even when pointing to ZST, it must be a well aligned non null pointer, and Vec deref as [T], so even if the vec is empty it must provide this, and if it store a null pointer the deref impl would have an if clause to create dangling but well aligned pointer, and you wan't unecessary branches in a function called in pretty much all vec operations, so it's more about performances than memory usage, pretty much all heap allocated data structure do the same thing for ZST, don't allocate for 0 bytes, but still create a valid reference.
All of this is true! But all of the rules you’re talking about (the strict requirements on the values of references etc.) were created for a reason, which is to allow the compiler to do things like layout optimizations (among other types of optimizations) when it sees references. So I think we are ultimately making the same point, when you consider _why_ Rust doesn’t allow non-null references. Your comment has some great added detail about what Vec has to care about internally; thanks for pointing out the Deref thing, that’s a super interesting point!
Edit: eh, I guess there’s that whole “memory safety” thing too or whatever that means references need to be non-null and properly aligned. ;) I’m so used to it I take it for granted.
@@_noisecode sorry if it wasn’t clear, I did not say it was not for niche optimisation, but this optimisation is more an added bonus of the original advantage of a dangling pointer. If I recall correctly don’t C++ also require references to be non null? I’m pretty sure null ref is UB in cpp too
Yeah I think I got what you're saying! I'm saying that it seems to me that the _reason_ that Rust requires e.g. an empty &[T] to still be non-null is for layout optimizations. Rust could've made the design choice that an empty slice was allowed to be null (meaning Vec could store a nullable pointer), since there's nothing to read from anyway; but instead it requires that it dangle so that all zeros remains a special value in every case. (I'm sure there are other reasons besides layout optimizations that this is a good idea, but layout optimizations is a big one.) Your comment is making me think about this more than I ever really have before so I really appreciate the discussion. :)
And yeah, null references in C++ are UB. Although IMHO Rust references aren't a direct analog to C++ references; they're somewhere between C++ references, C++ pointers, and a secret third thing (borrow checking).
I love the way this video is structured!
It's so well done that at ~12:00 I stopped the video, looked at the code and thought about it myself. And I arrived at the same conclusion you later gave.
So basically what a great teacher does: you taught me the basics and with those I was able to figure out the problem myself! :D
Incredible, the realization hit me at the exact same time. Really great video and way of explaining things
same. He gave the perfect example to make everyone watching the video go 'ohhh' before he showed the conclusion
This rules. Literally hit this issue in a Rust Advent of Code problem where I thought using `reserve_exact` would speed things up - but instead ground to a halt. Cheers!
8:11 you just made me understand that concept in 2 minutes, what a semesters worth of algorithm teachings could not.
The best RUclipsrs seem to have the fewest subscribers.. What great content, keep it up!
10:35 std::vector also has a constructor with a specified capacity:
std::vector v(7);
Nope, that's a constructor with a specified size, it's default constructing your type (here int, a default constructed int is zero) to make a std::vector of seven zeroes
Rust's Vec::with_capacity(N) gets us an empty Vec, but with guaranteed capacity for N items, very different.
@@tialaramexyou're saying like rusts approach is any better.
In C#, List.EnsureCapacity(int) doubles until the new capacity is equal to or greater than the specified value. It does let you pre-allocate at initialization to a specific value, though (or set capacity explicitly)
Good pointers, reserve() in a loop would definitely almost always be an anti-pattern. The concept that all elements have to be moved during a new alloc (which is most likely handled by a realloc in the array case) is sometimes/often not true, especially if the layout of every item is a multiple of 8 (or word size) and there's still space in the chunk without requiring a sbrk.
Also, past a certain allocation size, any memory allocator worth its salt should be putting your allocation in a separate virtual memory page rather than together in the heap where it puts tiny dynamic allocations. And, once you're at the point of dealing with full 4K pages, said allocator can "move" things by just asking the kernel to change a page's virtual memory address in the page table, no need for copying.
@@AJMansfield1 Man, what doesn't the kernel do? Kernel appreciation day anyone?
@@AJMansfield1 yeah well that always depends on current heap size (if u cant expand the text/data segment further). I think for malloc on most modern systems the default mmap value is 128kb, anything under that would be in the arena(s). Typically for contiguous memory you wouldnt want a big mmaped alloc since the ideal scenario is l1 caching so ud prolly want ur array to be smaller than 32kib or whatever the data cache size is on ur cpu. (Obviously different scenarios have different needs)
@@coarse_snad Everyday is kernel appreciation day
@@TehGettinq if you're doing a contiguous allocation that's larger than a cache line, there's little cache advantage to putting it in a shared heap arena - and a cache line is much smaller than a 4K page. The main performance reason not to use separate pages is because of the need for multiple kernel context switches. Not just the syscall asking for a page, but also for the page fault accessing that page for the first time - since the kernel usually only just makes a note of the fact you asked for one in the mmap call, and holds off on actually mapping in a page of physical memory until you try to write to the page it pinky swore it gave you.
I come back to your channel repeatedly -- intuitively, it's one of the best ways to learn coding skills. Please keep it coming.
Noo! I thought you'd be a channel with like ten hours of bingeable Rust content, but you are still new and building up your library content.
I love these videos, your production quality is great and your explanations are clear :D
Great video! Once again, we turn down to "be aware of what you are using and how it works".
It's interesting to me as I have heard the term optimization been described as "knowing your environment and doing the best job you can to optimize for it" and this is the explanation I choose to use.
It's funny that a lot of times we try to make optimizations and we don't know our environment very well.
This video was fantastic. I actually laughed when you were noting how confusing it could be and whatnot at 2:23. No worries, I'm here for it. Thanks for linking to other interesting resources, too.
This was a great video.
Not for the specific issue, but for how it showcases the way in which complex systems can fail despite everyone's benevolent intentions.
I'm glad this got recommended to me, I hope your channel grows and attracts more interested viewers. Peace!
I really liked the video. The animations are great and the explanation is clear and easy to follow. One piece of advice I can give is 'don't say what you're gonna show, just show it'. Don't take this as hate, but you shouldn't say that you will show something later in the video. There are situations where you might want to say that, but in most cases, it's better to shape your script so it flows smoothly and shows topics as you talk about them.
Man, you've been putting out great content. I'm only a Rust beginner, but I love the insights. Looking forward to more videos.
Dude... I swear this is the best coding channel I have ever seen. And I base that solely on this single video.
amazing video, made me want to learn rust + beautiful usage of manim
Thank you so much for your videos. I absolutely love the structure of your script, how you first motivate the problem, explain everything on the way, but not in obnoxious detail, and then arrive at your conclusion. Most educational video creator on the internet could learn a great deal from you.
Keep it up!
I think it's great that rust chooses the right defaults to prevent issues like these.
+1, I'm new to it, but rust seems to have just so many of these subtle choices correct.
not really a C++ language problem, but implementation issue. The standard never imposed that sort of thing
I mean language being "junior-friendly" is not really super beneficial. If that's the point, why bother using lowlevel languages in the first place? You can get pretty much same performance with languages that are easier to write and maintain in, and resort to {insert_low_level_language} bindings whenever you actually need it.
@@noobgam6331 once you work in a collaborative environment & have to work with tons of legacy, having fewer footguns is always a win regardless of your experience.
If you've worked with good corporate C++, often the self-imposed constraints look like just rust without official spec/compiler support.
@@cortexauth4094 And the standard probably shouldnʼt (I think; maybe it actually should), but the implementations can all choose to be better. And the main ones are all open-sounce now.
I feel proud of myself, as soon as I clicked I connected the dots and guesses the exact topic. Not a problem I've thought of previously (if I bulk append, I use proper methods, if I push an unknown number of elements, I don't manually reserve), so this was a great heads-up, and the real-world example and precise complexity calculations were great.
Nice explanation, especially how we can avoid the problem by calling the right function on the vector that adds all elements at once!
I wrote the top-rated (and accepted) answer to the StackOverflow question "How does the capacity of std::vector grow automatically? What is the rate?", and this is an excellent breakdown of what's going on here. And I wish I had put your caveat about `reserve` in my answer.
I'd link to it, but RUclips thinks all comments with links are spam. :-/
I read your SO answer multiple times while working on this video and it was very helpful in making sure I got it right. :) Here's the link: stackoverflow.com/a/5232342
(RUclips gives me a pass posting links in comments on my own videos)
@@_noisecode - I'm glad it was helpful! 🙂
really good video. you're such a good presenter and very eloquently showcased the issue and the solution.
The growth factor of vector in MSVC is 1.5x. It's 2x with GCC (libstdc++) and Clang (libc++)..
This was really, really good. Well presented and articulated. Also an interesting topic, I had no idea about vector optimizations, I just assumed each push was a new allocation, so I absolutely would have made that same mistake
You MUST make more video like this; On writing efficient C++ code, common and niche pitfalls, and best practices.
Very nice video and sensible advice! Thanks!
praise the algo for suggestibg rhis one. nitpick: if you replace obe of the early “ill explain it later” with the actual gist like “it boils down to this being linear and not geometric” it would still make me watch the whole vid while knowing youre not draging me along for retention. interestibg topic, thanks for sharing!
Excellent, these days I'm mostly programming in C#, and this applies as well 👍
C# also have 2 functions for reversing its List type
-Setting Capacity will force reallocation of the array to an exact size
-Calling EnsureCapacity() will grow the array exponentially till it reach the needed size
finally graced again with the brilliance of the greatest youtuber of all time
Heck yeah. Been looking forward to the next video like I've never done for any other channel.
I love your videos!
This actually came up for me a few weeks ago!
3:50 for setting up a TArray to only allocate on the stack, and not touch the heap, you can use the second template argument to specify a TInlineAllocator, with a set number of items. Any number of items added beyond that however do end up on the heap.
This is an awesome tip for everyday Unreal code! SBO vectors are super useful and I sorta wish I'd mentioned them in the video. Though unfortunately this trick only helps to optimize parameter passing if the API is generic over the TArray's allocator, which USplineComponent::AddPoints for example isn't. So the provided TArray has to use the default heap allocator.
@@_noisecode Since it's a const ref, would it not simply use whatever the original TArray allocator has set up for it? As far as I understand at least (and I could be wrong here), once you've added the items with an inline allocator, the TArray simply is keeping track of the pointers to the stack locations of the items. Then the loop inside USplineComponent::AddPoints would just have reference to those locations from the stack.
The allocator is still part of the TArray's type, though. `TArray` is a different type than `TArray` (which, due to a defaulted template argument, is really `TArray`). So passing `TArray` to a parameter of type `const TArray&` won't compile.
If you know you‘ll be inserting multiple times, it makes sense to check if the class you‘re inserting to (like the one with the AddPoints method) exposes a reserve function itself (or a non-const ref to the underlying vector, etc), then you can work around the AddPoints function‘s reserve by simply doing the geometric reserve yourself before you call that function, since reserve will just do nothing if the container is already big enough (at least in C++).
im a roblox developer and what is this
you develop roblox or on roblox?
Nice opossum
For me, rust is easier than making roblox games lmao
@@xijnin and its not a huge scam designed for tricking children
I'm a javascript hobbyist I understand just fine what he's saying. In fact I might have thought of similar issues before.
Amazing videos! Easy to understand, good food of thought and amazingly straightforward to make my programming better
Great videos, love the visualisations
Really like how you explain things and also the visuals are really helpful
woah a lot of times i watch stuff like this most of it goes over my head, this is so well explained thank you
Your videos are so much more informative and less religious than No Boilerplate's. Keep up the good work
I think you’ve found your niche to get recommended by the RUclips algorithm! Keep it up!
More than anything I love that you're a Rust content creator who doesn't talk down to me 😅😭
I love these types of videos. Even though I mostly use Javascript and Java day to day it’s still so important to know what’s actually happening under all the abstractions. Good stuff 😊
This is so good! Incredibly well made and informative and I too think these two languages are king (:
Amazing video. Great edits and explanations.
Excellent video. Though I don't think the conclusion w.r.t. API design is solid. The implementation of AddPoints is pretty much exactly as I was expecting. A higher cost up front, for an exact sized data structure at the end.
I'd expect linear time when called (thus exponential when called repeatedly), but minimal wasted memory.
Whilst I can see your point about preserving the exponential growth, I think that is pessimization of the most common use case, construction with bulk points.
That said, as is often the case, Rust does it right. A low-footguns default with 'reserve' and a method that gives you precise control with 'reserve_exact'.
The AddPoints API just calling AddPoint in a loop is quite pointless, there should be a reserve (both exact and with exponential growth) API and then have the user write the loop.
I always try to reserve a lot upfront [ie Vec::with_capacity], then let the dyn array do it's thing instead of reserve in a loop.
its*
babe, wake up, new Logan Smith video just dropped
6:08 While it's common for a null pointer to have a bit representation of all zero, that's not what C or C++ specified last time I checked. They do specify that casting an integer zero to a pointer will produce a null pointer, but not what e.g. "void *p; memset(&p, 0, sizeof p);" will leave p pointing at, nor u.p after "union {void *p; uintptr_t n;} u; u.n = 0;"
You're 100% correct--the bit pattern of a null pointer is not technically specified. In practice it's all zeros.
One thing I noticed looking at and starting to use a Dynamic Array in C is that appending a single element and a bulk of elements *is* the same operations. It's just if you're appending bulk of N or 1.
With this knowledge, you can design around bulk only with iterators and everything, and then keep appending 1 as a special operation. This I think communicates well to users.
Dynamic Arrays / Vectors are quite possibly one of the best general purpose data types ergonomically. So easy and simple too.
This is really great, I learn a lot from your videos
I read the title as reVERSE and I was so confused xdd. Great vid nonetheless
So the smart automatic allocation for growing a buffer with push() I believe is n log2 time complexity? Since it just doubles the size every time (just left shift the capacity integer once and ask for that much memory, feels nice).
One thing not touched on here is that preallocation can be useful when you need to allocate a large number of arrays and you know the size of them is smaller than the minimum default allocation. Obviously it's not so relevant to API implementors.
But there are many tradeoffs esp if the array size changes later. In general it's best not to call methods like reserve() unless profiling shows that you benefit from the optimisation
Thoroughly enjoyable video with good knowledge to share
In reply to 3:55... UE has TArrayView... but for some reason it isn't used very often in engine APIs. I have no clue why. I use it as a drop-in replacement for const TArray&, unless I am delegating to an API that requires a TArray.
Yeah it’s a great tool, but yeah, I don’t see it used much (kinda hence my wording about there not being a tool like this in UE’s “core design”). You also see ptr+length pairs used here and there but not too often. You could imagine maybe it’s tricky to support in BlueprintCallable APIs (and potentially opens up Blueprints to a new class of dangling-arrayview bugs?) and maybe it has just never gained much footing in the ecosystem for that reason, though it feels like it could still be used in engine-internal code more than it seems to be.
Really grate video!
At the beginning i was like how to use reserve is a bad idea, now i understand!
great*
Sounds like a good rule of thumb would be this: could you replace each AddPoint with AddPoints on a 1-sized array without a performance drop (irrespective of the heap allocation)?
This video gave me life. Thank you sir. Instant sub
This video is perfect!
This is a good video that is very accessible to wide audience.
On 16:34, though, I think it's far better to use `v.begin() + oldSize` or `v.end()` which are guaranteed to have constant time complexity instead of calling `std::next()` which is opaque and at least looks like it can walk through the iterator in O(n) time.
And even better to use std::back_inserter(v) and avoid these shenanigans.
So awesomely explained
Wow this is something I have to think about on a nearly daily basis, and the significance of this case never occurred to me. Thanks for the awesome video!
At first I didn't see the "controversy" of the video, because it seemed obvious when understanding the geometric growth idea of vector. But then you pulled out the example of using a bulk append in a loop, and I felt so stupid. I can totally see myself having implemented the bulk_append in that exact way thinking I was clever and then later down the line using it in a loop to add multiple elements without realizing the problem.
11:40 how did u do syntax highlight for pseudocode??
Did you investigate if its possible to add clippy-lint for forbid calling reserve_exact in a loop?
Randomly came across this video and enjoy going over some fundamentals.
This is great! Thank you for informing me of this!
I only clicked the video cuz of the cute crab. I was so disappointed the crab did not play a bigger part of the action in this video. Im going to rate this a 2 on IMDB. I want to see more cute crab action in the sequel. Can you makeba crab prequel?
Hm, why it wouldnt be better to resize only if the length is < than the value you want to resize it to and if grown factor was still small?
That way would be optimised for both cases.
10:34 “Feed two birds with one scone.” 😂
dude,,,, what a BANGER
really great video!
Love the video! Learned something new :D
As an example of growth rate of a vector, Java will multiply by 1.5.
An empty ArrayList (don't use Vector, it was a mistake, ArrayList is the one that's actually good) will by default have a capacity of 16.
If you wanted a capacity of 7, you'd ask for new ArrayList(7);
Oh yeah, and reserve is called ensureCapacity.
Just as an example of what Logan is talking about at around 10:00. Different implementations of these ideas give different tradeoffs depending on what you want to accomplish.
10 years of both high and low level programming and I've never thought of this situation
How is the reserve fn different from creating an empty vector with a capacity?
It does seem like even just adding in a loop really would be ok, since as long as you’re preserving that exponential growth, it’s amortized constant right?
reserving space up front can have benefits, ie you’d know ahead of time if you somehow ran out of memory i guess? It also might help reduce memory fragmentation if you’re doing other operations between those loops to avoid all the smaller growth stages of the array
@14:45 Alternatively, you could call reserve but do it with exponential growth (i.e., reserve the first power of two bigger than the amount you need). This'll work fine both in a loop and not, and it won't lead to any more wasted space than if you'd done nothing (since that's the amount of space just pushing would give you), so it's still a strict upgrade over doing nothing, even if there are some circumstances where it would be more wasteful than reserving the exact amount of space. I assume this is what rust's `reverse` method does, given that it exists distinct from `reserve_exact`?
@15:50 Oops, you said exactly that lmao. Though I will note that `extend` relies on the iterator's size_hint method, and in some cases, the programmer may know better how big the iterator is than the iterator can. For example, I think `flat_map` is pretty bad at size_hints, because each individual iterator might be between 0 and infinite length and even if those iterators themselves have exact size hints, the flat_map size hint can't consume the flat map, which would be required to traverse it. But you might pretty easily know the exact size of your flat map iterator if the logic there is actually pretty simple. E.g., if you want to construct every pair of numbers between 0 and a given bound, you could have something like
fn pairs(upper: u32) -> impl Iterator {
(0..upper).flat_map(|x| (0..upper).map(|y| (x, y))
}
then I'm pretty sure size_hint has no idea how big this is, so if you did
fn pairs_vec(upper: u32) -> Vec {
pairs(upper).collect()
}
you would probably benefit significantly from instead doing
fn pairs_vec(upper: u32) -> Vec {
let mut v = Vec::with_capacity(upper * upper);
v.extend(pairs(upper));
v
}
(side note: I know it's kinda niche, but I wish there was a collect_with_capacity method for these cases)
Rust’s std::vec can also tell you its current capacity so you can estimate if the vec will grow. I don’t know about the C++ STL though.
Your `flat_map` example indeed has the size_hint `(0, None)`, so it would use normal vector growing operations.
For this specific example, using `itertools` with `(0..8).cartesian_product(0..8)` would result in the correct size_hint `(64, Some(64)`, just for completeness sake. (I really like `itertools` personally and wouldn't want to live without it.) Interestingly enough, `(0..8).combinations_with_replacement()` using `itertools` does not have an accurate size_hint, also returning `(0, None)`.
In nightly there at least is the `.collect_into` method for iterators, which in turn only calls extend, but makes it look a little cleaner (at least in my eyes).
Unfortunately, using `.take(n)` only changes the upper bound of the size_hint, while `FromIterator` (after a few layers of abstractions) reserves capacity based only on the lower bound.
The only "quick fix" I found was with the `.pad_using` method from `itertools` - this needs a closure however which computes the missing elements if necessary. While it is never called if you know the actual size, it is probably still compiled into the binary.
There is also the `size_hint` crate, which is very small, but essentially only wraps an iterator with a given size hint (more or less what you want), which is returned as the lower bound when `.size_hint` is called. If such a dependency should be used is another question in and of itself, and I very much agree that a `.collect_with_capacity` method or some other workaround would be better, either in `std` or in the `itertools` crate, where it would make sense.
@@daskadse769 Yeah I wasn't super clear; flat_map does have a size hint because every iterator must, it's just the flat_map size hint contains zero information. It literally just says "the iterator has a length between 0 and infinity." And yeah, the specific example has cartesian product (though notably, I don't actually know if that's a zero cost abstraction in this case, could be interesting to benchmark) to do it with a size hint, but the point I'm making is that under some conditions, you do know much better than the iterator does how big it is. The "size hint" crate is neat though, something like that in itertools or std would be nice. Works better than `collect_with_capacity` because that doesn't generalize as well over different types of collections and would require additional implementations (though I suppose it could come with a default implementation that just invokes the current implementation and ignores the size hint).
what if the unreal function called reserve with double the size and only if the current capacity + the elements you want to add is greater than some threshold
this would be fine to call in a loop
What about calling resize() in a loop?
Since C++'s reserve can allocate more than the requested size wouldn't it be better to implement a version where the reserved size is always a power of 2? For example, if you call reserve(9) you get a capacity of 16. This way you get an exponential growth of the vector and avoid this problem. Of course, this means the vector won't be optimized memory-wise but this is one of the assumptions with vectors in C++, that the capacity is usually larger than the size. Maybe a new reserve method can be defined in the C++ standard with this exact behaviour.
@9:30 shouldn't that be O(log n) complexity, since log(n) of the pushes will trigger a memory reallocation? Calling it O(1)+ sounds like a marketing gimmick ;)
Cool video. I'm more of a C dev and don't always know C++ oddities, so it was weird learning that memory allocated with new or new[] can't be reallocated. It makes sense, but it's a shame that such a simple, massive optimisation for vectors isn't available.
even if you call realloc in c, it would most likely allocate a new memblock and copy the contents of the old block to the new one. It really depends on the malloc implementation tho, some allocators allocate in blocks of 32/64bytes and can be efficiently reallocated, some dont, some allocate in block only for small enough block sizes and use unpadded allocations for larger blocks. In short, it depends. So always assume that realloc shall make a new allocation. That's actually the reason, why it returns you a pointer, even if it might be the same pointer you just used: cuz it's most likely, that it would return a new pointer instead
This is why I don't believe the C developers that claim they're "seeing the real costs" and that C "isn't hiding anything" - your libraries are still hiding plenty. Your compiler optimizer is rewriting everything you write. Even writing assembly, the ops you have are all microcoded and pipelined and speculatively executed - there's no *real* bare metal anymore (maybe the FPGA people?), so you're just picking the language you like most, just like everyone else. Which is fine, just don't get smug about it!
@@SimonBuchanNzHi, FPGA Person here. Just wanted to point out that the synthesis tools do most of the heavy lifting, optimizations, dead code elimination, hardware mapping, timing analysis....
But I mean, you can kick down this can as far as you want. You will always get closer, c is closer than c++, assembly closer than c, and so on...
@@CheaterCodes I assume you *can* pick and place gates individually if you're an idiot, though?
I've barely looked at hardware design stuff - Verilog and VHDL seem totally crazy to me - so I still think of you lot as "the real space wizards".
Considering the non-stop pace of hardware security vulnerability reports, perhaps I should temper my idolatry 😄
What exactly do you mean with "can't be reallocated"? do you simply mean you can't call realloc on that pointer? why would you? What you would want is to instead call realloc on a pointer to the data which that object contains, and update any members which need to be changed to keep the object in a valid state. To make this less error prone you abstract that into functions. In the end it does the same you can do in c just in a safer package (e.g. when naturally using the object you won't be able to realloc the data and then forget to update its length variable). The problem in the video also still happens in c when you simply use realloc in a function you hand a T** or a pointer to a struct to and then call that function in a loop, or use realloc in a loop directly. Just C++ and rust do the speed up of amortized costs automatically when using something like std::vector while in C you'd need to use libraries that do it or write it yourself.
It's a good point, but why insert such small array into the spline? Can't you just run the loop and store the items in a new array before sending to spline, or just use the addpoint multiple times instead?
What is the difference between reserve_exact and with_capacity, in rust?
Vec::with_capacity creates a new Vec, whereas .reserve_exact operates on an existing Vec. Vec::with_capacity(n) is equivalent to:
{
let mut v = Vec::new();
v.reserve_exact(n);
v
}
Edit: actually, Vec's docs are inconsistent about whether with_capacity gives you exactly the requested capacity or whether it might give you a little more. Best way to know for sure would be to dive in and read the implementation!
3:47 You can get the pointer of a statically defined array, if the size is known you don’t have to dynamically allocate it
The method requires a TArray which will dynamically allocate the storage for the elements.
lets say the memory increases by a factor of 2. And you want to add 100 points 100 times (all different) which method would be faster?
Awesome video!
I just love when large C++ projects implements their own data structures of which there are ones provided by the language, making it impossible to get any benefits from optimizations. I understand it can be a compatibility thing, but Rust makes it much easier to avoid this problem; need an iterator that conforms to other iterators in the language? implement the traits. This makes it so easy to integrate different libraries together with the language without having to work with 500 different types and conversions in your main logic.
I do generally agree with you. Though, the thing that bugs me most about TArray isn't its mere existence--it's defensible that they would want to take full control over their vector implementation and tune it to their exact needs--it's the fact that the API is very different from std::vector. Methods are capitalized, things have different names (e.g. .resize() is called .SetNum()), iteration is kind of strange. It puts a needless barrier, both mentally and syntactically (which is annoying for e.g. templates), in the way of interop. Other libraries like Boost.Container that provide container alternatives do so while trying to mimic the standard API as closely as reasonable, which is a much better approach IMO.
In Unreal's defense, though, as a codebase it literally predates the C++ standard. There was truly no standard library to use when the project was started. I recall a forum post where Tim Sweeney explains that the reason they rolled their own containers was largely historical and due to this fact. In that same post he mentioned that it might be a reasonable medium- to long-term goal to migrate UE to use the standard containers.
Was nice and informative.
Thanks
You know, all of this could be handled reasonably by putting the reserve inside an if statement, where it will only run reserve if it the target number is at least a constant factor (typically 2) larger than the current len. If it is lower than that, either you would already have enough space preallocated, or you would run a similarly expensive reallocation just once, as if you had run reserve. This approach mostly gives you the best of both worlds, without the risks talked about here, and it is super easy to implement.
Yeah probably an oversight since they imagined that people would be using the insert_multiple() type function to do a true bulk insert and not just inserting small arrays incrementally. So I really can't blame the developers just one of those weird use cases they probably never envisioned happening.
It might be worth a little bit of constant overhead to allocate *at least* a factor of the current vec capacity, but if the numbers of objects you're adding is more than that factor then allocate some extra in the same allocation. Also just exposing a reserve function for the collection in the API could be useful
I liked this video quite a bit, but wasn't a huge fan of interchangeably using 'geometric' and 'exponential' as they are different rates of growth. Specifically, what we're most familiar with is geometric growth, that being a^n. However, sometimes you'll see algorithms with truly exponential growth, which are n^a. They go back and forth, but for greater values of n, geometric complexity is proportionally greater than exponential complexity.
geometric growth is not n^a?
@@user-dh8oi2mk4f geometric growth is n^a, yes. e.g. x^2, x^3, or n^2, n^3, whatever
@@stanstrum then why are geometric sequences defined as ar^n?
@@user-dh8oi2mk4f Oh, I might have it backwards. It's been a while since I've used either of them but you get what I mean.
examples of how to get geometric growth back onto reserve:
template
void reserve_geometric(std::vector& vec, size_t n){
vec.reserve(std::max(n, vec.capacity() * 2));
}
template
void reserve_geometric(std::vector& vec_to, std::vector& vec_from){
vec_to.reserve(std::max(vec_to.size() + vec_from.size(), vec_to.capacity() * 2));
}
I'm not saying it's better than the insert method shown in the video. I'm just giving example of how trivial it is to e.g. modify the reserve line in Unity AddPoints() to get this behaviour, if you want it. It's probably better than the method in the video which uses .resize() to grow geometrically because .resize() default constructs all the elements which might be expensive, depending on the type.