Huh I remember doing this back at Uni but was using sbrk instead of mmap. I wonder what the on the ground difference between the two is. My guess would be either "sbrk is implemented in terms of mmap these days" or "sbrk is a little faster due to being able to make more assumptions about the nature of memory". Also, you missed a final step after defrag. Having a single free list is fine but you get some nice perf boosts out of having multiple lists of different sizes so you don't have to scan through all the 8 byte blocks to get a 1kb block.
I remember looking through libmowgli's implementation of a custom allocator (mainly for checking how useful it would be for static memory allocation), and finding it to be quite impressive and relatively straightforward. I don't recall if it addresses the issues you bring up at the end of the video, but I wouldn't be surprised if it did.
I randomly had the idea to make a custom heap allocator few days back. Have been working on that idea for like a day. And boom youtube recommends me this video.
It also sounds like you didn't handle multiple threads allocating and deallocating chunks at the same time. The easiest way to do that would be to wrap a big mutex around your alloc and free functions, but that would be really slow.
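To make that concrete, here is a minimal sketch of the big-mutex approach (not from the video; my_alloc_unlocked/my_free_unlocked are assumed placeholders for whatever single-threaded allocator you already wrote):

#include <pthread.h>
#include <stddef.h>

/* Assumed placeholders for your existing single-threaded allocator. */
void *my_alloc_unlocked(size_t size);
void  my_free_unlocked(void *ptr);

static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;

void *my_alloc(size_t size) {
    pthread_mutex_lock(&heap_lock);    /* serialize every allocation */
    void *p = my_alloc_unlocked(size);
    pthread_mutex_unlock(&heap_lock);
    return p;
}

void my_free(void *ptr) {
    pthread_mutex_lock(&heap_lock);    /* serialize every free */
    my_free_unlocked(ptr);
    pthread_mutex_unlock(&heap_lock);
}

Production allocators like tcmalloc and mimalloc avoid exactly this global lock by giving each thread its own cache, which is why the single mutex is the slow-but-easy option.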
It wouldn't be possible to get a link or something to the code written for this video, would it? I'd like to take a gander at it so I may learn some more C and coding practices. Thanks either way, very cool video and idea!
I believe the standard library uses brk()/sbrk() system calls to increase heap size (but I believe there is only one break pointer per process, so you would have to disable the standard library's allocator).
It's still there for compatibility, but advised against. Apple deprecated those calls in macOS ("The brk and sbrk functions are historical curiosities left over from earlier days before the advent of virtual memory management."), and glibc on Linux only calls it once to allocate some memory for the allocator itself, leaving the rest to mmap.
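For anyone curious what the brk()/sbrk() path looks like, here is a minimal bump-allocator sketch (my own illustration, not the video's code); sbrk(n) moves the program break up by n bytes and returns its old value:

#include <unistd.h>
#include <stddef.h>
#include <stdint.h>

/* Grow-only allocation straight from the program break: no free list, no reuse. */
void *bump_alloc(size_t size) {
    size = (size + 15) & ~(size_t)15;         /* round up for alignment */
    void *old_break = sbrk((intptr_t)size);   /* push the break up by 'size' bytes */
    if (old_break == (void *)-1)
        return NULL;                          /* kernel refused to grow the heap */
    return old_break;                         /* old break = start of the new block */
}

A real malloc built on brk would grab larger steps and carve them up itself, precisely to avoid one syscall per allocation.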
You're not supposed to allocate in bigass chunks for small amounts of data because those chunks flood the processor cache with junk data and cause cache misses which dramatically slows performance when it's constantly happening (which it will because you'd be doing it on every alloc).
A good start, why not. Of course your coalescing does not protect from heap fragmentation: suppose the case where the caller frees every second chunk and keeps the others for himself. There could be, and actually are, more complicated scenarios. Whatever you do or do not do, whatever tricky memory-guessing strategies you employ, you end up with a fragmented, unusable heap, so you have to take more memory from the OS (to eventually spoil that too), and you end up with memory leaks. Java fights that using indirect addressing and costly memory defragmentation in GC, which does not fully address the problem, just defers the inevitable program crash and restart. The same thing happens to other resources in one form or another. That's how modern software works.
this statement is false. the inevitability of a program crash you mention is not a given. proper memory management can mitigate the effects of fragmentation. also, modern operating systems and programming languages have evolved to handle these scenarios more gracefully. while extreme fragmentation and poor memory management can lead to crashes and restarts, it's not an inherent characteristic of modern software. programs don't necessarily have to crash due to memory issues. they might run slower or become inefficient, but a well-designed program with robust error handling and memory management strategies can continue to function, albeit perhaps with reduced performance, in the face of memory fragmentation. so shut up
Try breaking up your memory chunk into separate pools where all of the allocations in a given pool are the same size. Round up an incoming memory request to the next largest chunk size you have a pool for. Then you won't have internal fragmentation because all of the allocs in a given pool are the same size... at the cost of wasting memory on the rounds ups.
This looks pretty neat. I'm considering making a small allocator in rust because rust doesn't let you pick the size of a new array (not a Vec) at runtime. Hopefully it won't be too complicated
Writing a really performant memory allocator is much more complex. Check the code of mimalloc, jemalloc or TCMalloc, which basically all internally work the same way.
I did something similar but it had a bit more to it than that and was hell to get to work, but when it worked it worked well. As I recall it was the debugging that was the trouble. it would screw up, but only once in a blue moon.
Have you tried a memory arena/area allocator? Any time that memory can be allocated in the arena, it brings more benefits than you could imagine (okay, you could imagine it). No leaks, no per-object dealloc required, no calls to malloc... It's like writing in Python, but it's in C.
wanna get good at programming? check out lowlevel.academy and use code THREADS20 for 20% off lifetime access. or dont. im not a cop
Hm, aren't you a cyber-security specialist? Kinda like a cop...
virgin standard librarian vs based and gigachad wheel reinventor
bro 💀
my wheel is rounder
@@LowLevelTV lll what do i do i accidentally started rewriting c++ stl and physically cannot stop
every time i go onto my computer my hand starts searching for implementation :(((
Cabbage 🥬
@@LowLevelTV and most definitely squeakier.
"This thing sucks!I can make it better.", "It still sucks but it's mine". Man... I feel right at home
That's me too, it sucks, but I know why it sucks.
Writing a memory allocator is a standard part of learning C. People will not appreciate how complicated it is unless they have tried making one themselves
I keep making horrible web servers in every language I learn. They are absolute trash and I'm not even getting better 😭.
Been there. Man, what would be the theoretical idea for freeing memory with the minimum loss of performance?
Well, the problem is that malloc is slow not because its implementation is bad but because it's designed to work as a general purpose allocator. One probably will not beat the years and years of programming art that went into implementing such a general purpose allocator by creating an identical clone. Performance lies in specialized allocators that can be way faster in some corner cases by using domain knowledge about the user code, which a general allocator like malloc doesn't have. Much like when Dx12/Vulkan was released and devs tried to reimplement Dx11-style drivers which had like 10+ years of optimization wizardry in them already. No way to beat this. Rethinking the approach to a more domain-centric solution was what gained performance benefits in the end.
im waiting for "intel sucks so i forged my own processor using raw silicon"
The story of AMD basically
Also there is a guy who did it in his garage 🤷♂️
@@MrMassaraksh calling that a garage is a pretty big stretch; my uni doesn't have the tools that guy had
I think a better concept is: "the x86 ISA is a real dog's breakfast, so I invented a whole 'nuther architecture."
@@christopheroliver148 RISC-V origin story
I just love the idea of working in a professional workflow, trying to grab some memory and just getting, "nah", for my trouble.
Needs the audio, too 😂
@@alvaronaranjo2589 One of the hacker moments of all time...
I once got the error message "Way too long, dude" from a first-party macOS application.
your ability to make C programming vids entertaining and funny is pretty amazing tbh.
When you realize C is both of those things.
in my university, creating your own allocator is a mandatory project in the third semester! it's nice to see someone actually doing it for fun :D
Some of my coursemates were writing garbage collectors as a mandatory thing. Even those hopelessly dumb at programming managed to pass it somehow. Comparing that to today's tech screenings... kids asking childish questions pretending to be rocket scientists. Able to use a standard library hash map, what an achievement.
@@InconspicuousChap An unfortunate number of colleges and universities teach programming and call it computer science.
you from feri?
holy shit where are you going university of wizards??
@@masterbaits4108 Georgia Tech.
"Is it faster? No. Is it more efficient? No." It's always fun--and honestly the best way--to learn by writing things that just don't make any sense. Like using C to make something slower and less efficient. But you are 100% and objectively correct when you say all that matters is that you learned something. Most of the best discoveries humanity has made come from people just trying stuff. Getting stuck into only doing what is "correct" is a box that's identical to just not doing anything at all. Do things The Wrong Way™ more often, and you'll be shocked at how much you grow as thinker and problem solver. You completely violated the entire reason we still choose C to write things--speed and efficiency--but made fantastic content and learned cool stuff doing it. And I learned cool stuff from you. As a professional teacher, I fucking love this kind of stuff. I wish I could convince my students to go offscript and try this kind of stuff so that they actually learn things instead of just memorizing and creating dogma. What a fantastic idea, honestly.
you should become a teacher!
Think of calling malloc like going to the store to get food, and having your own allocator like going to the fridge.
You don't want to use malloc every single time you want to "eat food", that would take so much time to travel back and forth. Instead you want to "meal plan", occasionally go to the store and stock up your "fridge" with memory you think you'll need, and then when you get "hungry" you pull some out to "eat", already conveniently accessible to consume.
Where the analogy breaks down is giving the food back to the store when you're done with it. Maybe a video rental store is a better analogy, but no one knows what those are anymore.
Thanks for the advice, but i already had dinner.
@@reed6514 instructions unclear, I've eaten an entire store worth of VHS tapes.
"Maybe a video rental store is a better analogy, but no one knows what those are anymore."
How about car rental?
a library?
you mean mmap right? both glibc and musl implementations of malloc have their own 'fridges' in your analogy
I recently had to do this for my bachelor's degree and I am so f*ing proud of what I did. I found writing a function like malloc on my own really hard, but finding out things about the operating system and communicating with it was even more fun and interesting. Great video, took me back to my own struggles.
I had to do this in University. It was a good time. Credit was awarded according to performance metrics, so if you wanted full points, you had to make a sophisticated allocator.
if you were learning C, and threw in a Garbage Collector you probably would have been flunked.
@@MistahHeffo I mean, yeah. We had to implement the malloc interface, just like in the video.
The sophisticated part was making an efficient allocator underneath that interface by using effective data structures and optimizations for the task--an explicit free list with coalescing was sufficient to get full points, but you could potentially get extra credit with more effort.
@harrisonfackrell I meant that if you implemented a Garbage Collected Heap Allocator in C that was absolutely flawless, you would have been flunked on ideological grounds alone
Man that looks interesting! Do you have the code uploaded somewhere on the web? Like Github or any other code sharing platform. I'd love to take a look at your code.
In this case it’s more efficient to not set the free pointer so close to the end of the allocated space. Something to do with polynomials
If I try to write a heap someday, I am going to name it Heap Of Garbage.
Haha, I actually got the idea in my head to try to make memory allocation in C easier too, but just doing so by obfuscating malloc() behind another function. I think you had a lot better of an approach 😂
Creating your own heap allocator. What an absolute Chad! 💪🏻
Wow I literally made a malloc yesterday in x86 assembly and today I see you upload your own malloc. The universe is connected 🧠
Yesterday, as in you did it in a day??
Yes
Couple of hours actually. But I used sys_brk instead of mmap, which allows me to actually extend my heap past the initial allocated value, if my kernel of course allows me 😢
Can we see your code for reference
< yes here @@babudelhi9885
i'm a cs student in germany, and we are currently, in one module, programming a little multitasking OS for an ATmega644, with our own memory and heap drivers, as well as malloc with different mem alloc strategies, free, realloc and such things. It is really nice and one learns so much about low level programming with C!
I love how I just did this a couple of weeks ago as a hw for my intro to computer systems class (CMU). We did malloc using an 8-bit header and we implemented coalescing as well. Honestly I just wanted to add this cause it was great and I really do recommend everyone try to do it as a project if you want to learn about heaps, malloc or just practice some C programming. Less than 2000 lines should do it all.
Funnily enough, I programmed my own malloc and free a week ago; I went with an array of bytes (unsigned char) with a predefined size, and implemented a doubly-linked list with the data being in-line with the heap object info by using an empty struct at the end of the heap object info definition as the data marker. I didn't need an 'in use' boolean because I used the heap object's size as an indicator, 0 meaning there was no heap object there.
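A rough sketch of that layout as I read it (my illustration, not the commenter's actual code); the "empty struct at the end" trick is what C calls a flexible array member:

#include <stddef.h>

/* In-line header: the payload starts directly after the metadata. */
struct block {
    struct block *prev;       /* doubly-linked list through the backing array */
    struct block *next;
    size_t size;              /* payload size; 0 means "no heap object here" */
    unsigned char data[];     /* flexible array member marking where the data begins */
};

#define HEAP_SIZE (64 * 1024)
static unsigned char heap[HEAP_SIZE];   /* predefined array of bytes backing the heap */

/* free() can step back from the user pointer to recover the header. */
static struct block *header_of(void *user_ptr) {
    return (struct block *)((unsigned char *)user_ptr - offsetof(struct block, data));
}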
Two or three years ago I wrote a non-contiguous memory allocator in rust. It was a fun little project, though it was actually for my work. I ended up learning a ton about memory security and how memory works at the lowest level. If you added canary pointers and zeroed out your heap memory before dropping it, you would be most of the way to the security of something like libsodium. They also have memory guards and they use the kernel memory permission calls to lock memory, but something I learned from an audit of my system is that locking memory actually makes it more potentially exploitable. With memory locking, you basically attach a metadata flag to the memory chunk which the system looks at before it attempts to read or write to that chunk. If a malicious actor forced a system coredump of the memory, they would still be able to see what was inside of the memory chunk and, most importantly, it would have that metadata flag that shows that it was locked. In other words, the malicious actor could use the metadata flag to find where the most secure data was in the core dump logs. There are ways around this of course, you can prevent chunks of memory from being dumped, though we did something completely different for the non-contiguous memory allocator. Effectively what we did is we used a bunch of xors and hashes to split the secure data into parts and put it all over a large area in the memory space. And then when you went to retrieve the secure data, the allocator would take all of the discrete chunks and xor them together to get back the secure data and a hash of the secure data.
Yep. You basically created a memory pool. Went through all of this and a whole lot more with the dynamic memory pool component of my (non bare metal) OS. Great video 😁👍
Out of curiosity, why use mmap() to allocate memory? The kernel has an actual syscall for handling heap allocation to the process, brk(), and libc usually has the sbrk() wrapper to make things a little easier. This is a good learning exercise though to learn about the black magic that is heap management and all the subtle nuances that go along with it.
This is all nitpicking, the real issue, which is unfixable if you want "your own" malloc to be perfect, is portability; outside Linux these are not a thing! Outside POSIX mmap() is also not a thing (don't say "Cygwin", that's pure hackery!).
@@erikkonstas brk() and sbrk() are available on nearly all Unix and Unix-like OSes, so it's quite easy to make a malloc() implementation that is portable within the Unix realm. While Windows doesn't support brk(), it does have a similar system call, and it's not all that hard to structure your code to be portable between the two. Is Cygwin even relevant anymore considering WSL?
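As a rough illustration of that portability point, the OS-facing part can be hidden behind one small function (a sketch assuming mmap on Unix-likes and VirtualAlloc on Windows), so the allocator logic above it stays identical:

#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
static void *os_alloc_pages(size_t size) {
    /* Reserve and commit read/write pages in one call. */
    return VirtualAlloc(NULL, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
}
#else
#include <sys/mman.h>
static void *os_alloc_pages(size_t size) {
    /* Anonymous private mapping, same role as VirtualAlloc above. */
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
#endif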
Yeah, did that during a Bachelor module.
But without any (standard) library.
It was shitty, but worked.
he also did it without any standard library
@@human-ft3wk He uses the "mmap" function from the standard library.
@@reD_Bo0n gotta get memory somehow
@@UncleJemima Write your own wrapper for kernel interrupts (in assembler)
@@reD_Bo0n I mean, mmap is a system call, if you're going to avoid these you're basically rewriting the OS, which might be out of scope.
Had to do this for a lab in a computer systems programming course. That had to have been one of the most fun and informative labs I've ever done.
I got a very "primitive" model of my heap allocator working. However, I'm not sure how to properly "classify" or describe it, since it works exclusively with a particular type of "meta-object" that actually handles the "span" of the allocation... Yes yes, that means that it's more of a frame-work than a run-of-the-mill allocator. But! It doesn't leave any holes in the heap... in fact it barely moves anything on the heap much at all whilst also being able to resize existing spans of memory "in-place"... it's probably the most illegal c code you've ever seen tbh, but you have inspired me "to boldly go where behavior has yet to be defined" 😎
I do really appreciate the approach: you’re not reinventing the wheel to beat what tens of years of development lead us all to. You are just learning by doing, and I support that very much!
For the sake of educational purposes, though, I would suggest a follow-up with garbage collection, added quite easily by replacing the inuse variable with an int index!
You can then use that whole thing as a basis for a discussion about race conditions and so on.
Keep it running 🤟🏼
{Just realized this is a relatively old video, and likely going to go unread by folks, but Ill leave it here anyway}
Making an allocator is definitely a fun project for low-level folks. I would put it at a similar place as 'building a compiler'. Not technically needed, but very very useful to do. tcmalloc used to be king of the hill. Unsure what is now.
What a lot of projects would do is basically get a 4k page for anything smaller than 4k. Those 4k pages would be split into subchunks of varying sizes, the smallest being 4 bytes.
You can then use things like bitmaps for knowing where those smaller memory pieces are.
It was more complex than that, but roughly, that was how quite a few games would build their allocators.
So, memory allocation is then basically:
void * malloc( size )
is size > 4k? Just allocate and return (this can get more complex too where you may want to keep some memory for yourself and just reuse it. There are time costs getting memory from the OS)
find page with chunks next_pow_2( size ).
if none, make one
find free spot in chunk allocation, and return it.
Another thing you can do if you write your own, and you have a particularly horrid memory overwrite bug, is to do that thing you mentioned about not doing and go ahead and allocate a full 4k page per allocation. This definitely wastes a TON of memory, but not enough that it wasn't useful to track down some bugs. Also, because of the MMU and the huge address space that's available (48-bit nowadays, I believe?) you have enough "address" space to place all these pages.
So, you allocate the 4k page, then place your entity either at the start or the end of the page. You then mark the 2 pages around your page as invalid / can't read / can't write, then from there the OS will segfault, which is easy to catch and hopefully gives the exact code (or nearby) that is causing the issue.
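That guard-page trick looks roughly like this (my sketch of the idea, not the commenter's code): give every allocation its own page(s) and poison the page after it, so the very first out-of-bounds write faults on the spot:

#include <sys/mman.h>
#include <stddef.h>

void *debug_alloc(size_t size) {
    const size_t page = 4096;                              /* assumed page size */
    size_t pages = (size + page - 1) / page;               /* pages needed for the data */
    size_t total = (pages + 1) * page;                     /* plus one trailing guard page */
    unsigned char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    mprotect(base + pages * page, page, PROT_NONE);        /* any touch of this page segfaults */
    /* Put the object at the END of its pages so an overrun lands in the guard page.
       (A real version would also round 'size' up so the returned pointer stays aligned.) */
    return base + pages * page - size;
}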
i had to do this for a cs assignment in college, it’s honestly crazy how much shit gets abstracted behind the standard C libraries
I did that using assembly for a university project. I still have the scars.
I love this video, brought me back to High School where the Systems teacher had us write an allocator and a scheduler. It also came useful since I actually had to write a custom allocator several years later for a job.
For what job?
"We could truncate the chunk."
Chunkate, surely.
This channel really keeps me interested in learning c, thanks :)
Here is my version (much better):
my_alloc() {
malloc();
};
my_free() {
free();
};
Yeah, you'll just end up with memory leaks and crashes, but funny 😅
#define my_malloc malloc
#define my_free free
I had to write a custom heap allocator in my OS/System Programming course. The only really involved C project we had written (in a prereq) was a simple CRUD driver; we were given 2 weeks to write it alone (heavy collaboration penalties), *and our grade was entirely based on cycle and memory efficiency relative to the standard C implementation*
I scrapped my entire codebase twice in the process and after several programming fugue states driven by dangerous levels of caffeine consumption I ended up with an 85%
This is a "fun" assignment we all do in CS as part of learning about the OS. I do wonder about the linked list though, not sure that's the best way to go about it. From what i recall in my courses we just used chunk indices and saved the next free index in the memory header
I would not write my own malloc / free functions, but I have built my own "memory management method" upon them.
First! 😍 Oh yeah... 3:00 - lol - every time I write malloc(), in my head I sing Monzy's So Much Drama in the PhD where he goes "I ain't never callin' malloc without callin' free."
0:06 The essence of C programmers
this was quite interesting to listen to, always wondered if anyone from the "new schools of thought" actually did any of the old-school stuff....I've probably made at least a dozen of memory allocation routines.
some advice, double-linked lists work better along with storing the table information directly into the heap, as this would redirect the cache for use at that location for when the requestor decides to actually use it. The "structure" I used had 4 variables (next, prev, flag, size), and you can recast a pointer to that structure/class to get the data at the heap location. The flag contains an ID mark (to ensure that you are freeing a valid pointer) along with other use case checks for various things (depending on if you have "shielded" or contained protection, or if the data contains multiple location paths for cluster storage....etc. etc., very advanced stuff). Aside from that, you only need maybe 5 other pointers in a static position, the heap pointer itself (to free when the program unloads), the first/last of used memory (init to null) and the first/last of free memory (init to heap pointer). When allocating, the return should be a pointer adjusted after the structure/class, and when freeing, subtract that size to get the true location. "size" of the data used includes the struct/class as well, makes it easier to coalesce calculate later (if true pointer plus size equals next free pointer, then it can be combined).
why double-link? it literally is much faster to assign and unassign, and defrag comes along with it. Plus, you could have "solid allocate" functions where the requested space may lie dormant for a while, allocated from the back. It's also not that difficult to add in the ability for multiple connected heap spaces, as link pointers can jump memory bounds.
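A bare-bones sketch of that header layout and the pointer adjustment (my reading of the comment; the magic value is just an assumed placeholder):

#include <stddef.h>
#include <stdint.h>

struct hdr {
    struct hdr *next;
    struct hdr *prev;
    uint32_t flag;    /* ID mark plus state bits, checked on free to catch bogus pointers */
    uint32_t size;    /* size of header + payload, which keeps the coalescing math simple */
};

#define HDR_MAGIC 0xA110C8EDu   /* assumed ID mark; any recognizable constant works */

/* The returned pointer sits just past the header... */
static void *user_ptr(struct hdr *h) {
    return (unsigned char *)h + sizeof *h;
}

/* ...and free() subtracts the header size to get the true location back. */
static struct hdr *hdr_of(void *p) {
    return (struct hdr *)((unsigned char *)p - sizeof(struct hdr));
}

/* The coalescing test from the comment: true pointer plus size lands exactly on the next free header. */
static int can_merge(struct hdr *a, struct hdr *b) {
    return (unsigned char *)a + a->size == (unsigned char *)b;
}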
We had a project in our school where we had to do the same thing but with realloc and calloc to which was quite fun
funny that this would come out when I've been painstakingly writing my own dynamic memory allocator in assembly the last couple days for my homebrew system
That's essentially what they asked us to do as homework for our OS dev course :D We had to implement it in Minix.
Next on the list: warning about potential memory leaks and maybe even a visualization of the current heap allocations - its age, its size, etc.
I love how your videos are so short but packed with content ❤
always enjoy these videos, I used to program exclusively in C back in 1990's and now I get to have a retro back to the future blast each week! cheers mate and happy christmas!
this vid was incredible and also gave me instant anxiety from assignments i had to do like this in college in c
huh, I remember this guy from LinkedIn posts
I'm either a GigaChad or a mega-dumbass. I replaced malloc with a hierarchy of allocators, like an OS kernel. In the context of a game engine in C, freeing entire allocators is simpler than micromanagement of individual allocations.
A SLAB (SLOB, actually) allocator is used to process configuration files without calling malloc for every single parsed token. I free pages by walking the list of used, free and partially filled caches and letting pointers go out of scope.
There's an allocator for allocators that uses a tree with a headerless scheme. It's basically for subdividing pages and merging on free. Fiber stacks, and allocators are clients. You don't want 4072 bytes of fragmentation when using guard pages, or to put allocator metadata in a read-only page or a guard page. Predecessor and successor nodes are the candidates for merging.
This is probably going to bite me. I don't know how yet.
I did this in 1999, but this was for debugging purposes only. My alloc basically wrapped GNU's alloc. I did this to prove that my code didn't leak memory. Turns out, after a few iterations, it didn't. So the tool was successful.
My code, which I was trying to harden, also didn't use stupid memory allocation like strdup(). I was basically working with pools/arenas already. My code also didn't make much use of raw pointers. Instead, I relied on unsigned integers as indexes into a block of memory.
Instead of creating custom alloc/free functions, an alternative method for integrating a malloc variant into application code is to implement malloc and free themselves. You can compile the allocator library into a shared object, and set the LD_PRELOAD environment variable to point to that file. Then, your custom malloc/free code will be used in place of the default implementations whenever you're running an application.
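For the curious, the LD_PRELOAD route looks roughly like this (a deliberately dumb sketch: the overriding malloc can't call the real malloc, so it just hands out one mmap mapping per request):

/* shim.c - sketch of an LD_PRELOAD malloc/free override (Linux).
   Build:  gcc -shared -fPIC -o libshim.so shim.c
   Run:    LD_PRELOAD=./libshim.so ./your_program                      */
#include <sys/mman.h>
#include <stddef.h>

void *malloc(size_t size) {
    size_t total = size + 16;                      /* 16-byte header keeps alignment and length */
    unsigned char *p = mmap(NULL, total, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *(size_t *)p = total;                          /* remember mapping length for free() */
    return p + 16;
}

void free(void *ptr) {
    if (!ptr)
        return;
    unsigned char *p = (unsigned char *)ptr - 16;
    munmap(p, *(size_t *)p);
}

In practice you also have to override calloc, realloc and friends, or the program will mix your allocator with the default one and crash.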
Just wondering, what happened to the advent of code videos?
In order to make a performant allocator you typically base it off a statically allocated pool of memory with a different base structure to allocate, deallocate and remerge regions from that very pool... Any system call to dynamically allocate "behind the scenes" is basically just programming a facade, which in turn might be even slower than calling the functions directly.
I did many implementations for microcontrollers and Windows in C and C++ ...
If someone wonders "Why for Windows?!": because if you have a ton of allocs and deallocs, especially of smaller buffers, in a short amount of time, you might get randomly greeted by obscure exceptions which are basically caused by awful memory fragmentation. Newer Windows versions are a bit more robust, but they still have that issue.
The game engine DarkPlaces (Nexuiz and others) uses its own memory allocator, which is a wrapper around malloc and free. It allocates a bit more and writes additional info - from where, why and which group - which is useful for debugging, not only for memory leaks.
Gosh darn. That little “teehee” at the end is what got me to subscribe.
"Imagine that a user allocated exclusively 32 byte fragments.. and now pretend for a second that they freed them" Now this is pushing the boundaries of credibility...
Could be worth revisiting this project and seeing what other APIs you could provide that makes allocation easier to deal with.
Could be simple things like seeing how the implementation is affected by requiring free to pass a size as well, or more complex things like looking at arena allocators and how you can use pass-by-value semantics to make the program automatically free memory when you return from a function without the need for a GC or any use of the free function.
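A tiny sketch of that arena/pass-by-value idea (mine, not the video's): the callee receives a copy of the arena, so everything it allocates is implicitly reclaimed when it returns, because the caller's offset never moved:

#include <stddef.h>

typedef struct {
    unsigned char *base;   /* one big pre-allocated block */
    size_t         used;   /* bump offset into it */
    size_t         cap;
} Arena;

void *arena_alloc(Arena *a, size_t size) {
    size = (size + 15) & ~(size_t)15;        /* keep allocations aligned */
    if (a->used + size > a->cap)
        return NULL;
    void *p = a->base + a->used;
    a->used += size;
    return p;
}

/* Takes the arena BY VALUE: scratch allocations vanish on return,
   since only this local copy's 'used' offset ever advanced. */
void do_scratch_work(Arena scratch) {
    char *tmp = arena_alloc(&scratch, 256);
    (void)tmp;   /* ...use tmp freely, never call free()... */
}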
Oh man.. this reminded me of our OS course in uni. We made a visualization of this in Java, complete with compaction, coalescing, and automatically freeing the allocated memory once the process is done. It's just a visualization tho, it didn't actually allocate heap (except for when we create new objects, but that's just a technicality).
I had this exact exercise!
I remember creating a slab allocator for a stub OS was one of the projects one could choose for (part of) the exam of Operating Systems. While it seemed cool, I was more caught up on signals, but I sort of remember the way it was done, so this rings a bell
My personal anecdote about "reinventing the wheel" (but learning important stuff along the way) was building a simple 3D model editor (for modding a game) with zero access to an actual 3D library API. I learned about coordinate transformations, surface normal calculation and cull-facing, trianglefans vs. trianglestrips, and _so much more._
Can you make a more indepth video about dynamic memory allocation? Explaining how glibc's malloc work, other alternatives, how does it differ from other systems languages, like rust and zig... Can you have different heaps? Etc
Kinda relatable. I was making my own small programmable tool that takes instructions from a file and spawns processes to run them. The only issue was that I was duplicating the memory allocated for the file's content every time I forked, so I made 6 functions, three for input and 3 for cleaning up and detaching. I used the shmget API
An easy way to combat fragmentation is to use a buddy allocator (or slab allocator) strategy. It has pros and cons, but dealing with fragmentation sucks.
We had to write our own malloc as part of our first "real" computer science class at Georgia Tech. It was the only assignment I submitted that I knew didn't work. I still somehow passed. Despite not working in Ubuntu, it somehow worked in Windows. All of our assignments were supposed to be graded against an Ubuntu image distributed by VMware, but my TA must have been a little lazy or overwhelmed that day and graded on his own laptop.
Make 3 free lists, one each for smol, mid and large chunks. Just select the smallest one that is still larger than what the user requested. No fragmentation, no merging, different just because, maybe faster, can be tuned to your application.
Exactly what I was hoping for. Thanks for the video. 🙏
I had been starting to wonder if memory allocators did defragmenting. That stuff was only ever mentioned to me when talking about garbage collectors, and I was really starting to wonder if this was a consideration
Not in the same way. You talk about defragmentation in GC that uses compaction, generally by copying all allocations that still exist to a new space without the gaps, then updating all the references to the old locations.
Without a runtime you can't do the latter step, so the best you can do is try to allocate things in such a way that deallocating isn't likely to mess things up.
One simple and effective approach is to keep all allocations that are pretty close in size together, say, all the 8 byte allocations, 16, 24, 32, etc... Then no matter what order things are deallocated you still have the same size available.
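In code, that segregated approach looks roughly like this (a sketch; the class sizes and count are arbitrary, and the fallback to grabbing fresh memory is left out):

#include <stddef.h>

enum { NUM_CLASSES = 8 };                        /* classes of 8, 16, 24, ... 64 bytes (assumed) */

struct free_node { struct free_node *next; };    /* link stored inside the free chunk itself */

static struct free_node *free_lists[NUM_CLASSES];

static size_t class_of(size_t size) {
    size_t c = (size + 7) / 8 - 1;               /* 1-8 bytes -> class 0, 9-16 -> class 1, ... */
    return c < NUM_CLASSES ? c : NUM_CLASSES - 1;
}

void *class_alloc(size_t size) {
    size_t c = class_of(size);
    struct free_node *n = free_lists[c];
    if (!n)
        return NULL;                             /* real code would carve a fresh page here */
    free_lists[c] = n->next;                     /* pop a chunk of exactly this class's size */
    return n;
}

void class_free(void *ptr, size_t size) {
    size_t c = class_of(size);
    struct free_node *n = ptr;
    n->next = free_lists[c];                     /* push it back; no merging ever needed */
    free_lists[c] = n;
}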
I wrote my custom allocator recently. The trick I used was to only allocate chunks of sizes of powers of 2. I also had them clustered that way. Finding the right free chunk is trivial then because it can always be larger than what the user requested. Besides, should the user want to expand the chunk later on it would also be trivial as long as the requested additional memory doesn't make the chunk exceed the next power of 2. I was dealing with array buffers that needed to automatically grow when they get full. Since usually you would double the size every time the buffer gets full it makes a whole lot of sense to only allocate chunk sizes of the powers of 2.
Cool, but probably memory inefficient if your allocations aren't powers of 2.
@@monsterhunter445 yes it wastes memory but since I develop games I always optimize for performance at the cost of memory.
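For reference, the rounding step the parent comment describes is just "smallest power of two that is >= the request"; a quick sketch:

#include <stddef.h>
#include <stdio.h>

static size_t next_pow2(size_t n)
{
    size_t p = 1;
    while (p < n) p <<= 1;      /* smallest power of two >= n */
    return p;
}

int main(void)
{
    printf("300 -> %zu\n", next_pow2(300));    /* 512 */
    printf("512 -> %zu\n", next_pow2(512));    /* 512 */
    return 0;
}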
2:25 "I don't know. I read it somewhere on Stack Overflow that that's bad" is a beautiful joke 💀
Yaaay!! That’s actually pretty cool! I did this as part of my final degree thesis in computer engineering and it was pretty fun 😊😊 Nice vid as always!!
Great that you did this for fun but I had to do this as a project in a top 20 school
How did the mmap work, btw? You passed a negative file descriptor to the call. This seems to be a nice implementation.
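That negative file descriptor is almost certainly the anonymous-mapping case: with MAP_ANONYMOUS the mapping isn't backed by a file, so mmap expects fd = -1 (and offset 0) and hands back zero-filled pages. A minimal sketch:

#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std modes */
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 4096;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);   /* fd = -1: no backing file */
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    ((char *)p)[0] = 42;                                  /* ordinary usable memory */
    printf("mapped %zu bytes at %p\n", len, p);
    munmap(p, len);
    return 0;
}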
Subscribed, keep it up my dude. I'm going to school for a computer science degree and I gotta learn C, so this was cool.
This is interesting because I had learned (apparently erroneously) that C doesn't have a heap and that malloc just got arbitrary chunks of memory from the OS. It's a bit confusing because sources will say both that C has no heap and that malloc gets its memory "from the heap". I guess what it comes down to is that in C "the heap" is specifically whatever malloc manages, not something the compiler manages, whereas in C++ I guess the compiler manages it.
If only I had programming memes 15 years ago... I needed this my whole life.
This reminds me of the pain of trying to code a texture atlas from scratch in OpenGL.
So obviously, as a wheel reshaper, I also did (started making) my own memory allocator with blackjack and hookers, for C++. The blocks were separated into three main types of structures. One was for big allocations, which is simple. The second was for small allocations, with a power-of-two size fixed per chunk, and free slots encoded as binary ones and zeroes that were looked up with MMX/AVX least-significant-bit operations. The last type was the free chunk, each one organizing itself into buckets and multi-node binary trees (again MMX+ operations) ordered by size, to make allocating a sufficiently sized chunk quick. It was already looking to be really fast in its half-complete form, on the order of millions of allocations and deallocations, but of course the main original objective was to support soft deallocations too, which would only destruct objects when the space was actually required. The idea being that you don't necessarily want to destroy, for example, texture data you worked hard on loading into memory; you just flag it for recycling and maybe reclaim it later. Making the whole thing work got annoying and too hacky with new and delete operators, destructors and constructors, plus I started running into concurrency problems, so I ended up not completing it and it landed on the forgotten-projects shelf.
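The free-slots-as-bits lookup described above is a neat trick on its own; here is a small sketch of the idea, with field names invented for illustration and GCC/Clang's __builtin_ctzll standing in for the least-significant-bit instructions mentioned:

#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t free_mask;     /* bit i set = slot i is free */
    char     slots[64][32]; /* 64 fixed-size (32-byte) slots */
} small_page;

static void *page_alloc(small_page *pg)
{
    if (pg->free_mask == 0) return NULL;            /* page is full */
    int slot = __builtin_ctzll(pg->free_mask);      /* lowest set bit = first free slot */
    pg->free_mask &= pg->free_mask - 1;             /* clear that bit: slot now in use */
    return pg->slots[slot];
}

static void page_free(small_page *pg, void *p)
{
    int slot = (int)(((char *)p - pg->slots[0]) / 32);
    pg->free_mask |= 1ULL << slot;                  /* mark the slot free again */
}

int main(void)
{
    small_page pg = { .free_mask = ~0ULL };
    void *a = page_alloc(&pg);
    void *b = page_alloc(&pg);
    printf("slot a=%p b=%p\n", a, b);
    page_free(&pg, a);
    printf("after free, next alloc reuses a: %p\n", page_alloc(&pg));
    return 0;
}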
I wrote a program that replaces malloc with mimalloc (a better memory allocation library) in some games (Factorio and Stellaris, but it should work with anything that's:
1. On Windows
2. 64-bit
3. Statically linked to the same standard library (or any version that has the same malloc implementation).
My program works by injecting a DLL and then doing low-level stuff to look for the byte signatures of functions like malloc, free, ... then I just replace the first instruction in those functions with a jump to the mimalloc version.
This change of memory allocator alone gives Factorio and Stellaris up to 20% more performance (if Large Memory Pages are also enabled; without them it's more like 1-5% or something), which clearly shows how bad malloc is and how much using something better can help.
Or you could just have pre-allocated all the memory you will ever need and use that large chunk any which way you like. There is always a complicated solution to a simple problem. :-)
I can't believe I did this back then. Also wrote a garbage collector, because why not.
Huh, I remember doing this back at uni, but using sbrk instead of mmap. I wonder what the on-the-ground difference between the two is. My guess would be either "sbrk is implemented in terms of mmap these days" or "sbrk is a little faster due to being able to make more assumptions about the nature of the memory".
Also, you missed a final step after defrag. Having a single free list is fine, but you get some nice perf boosts out of having multiple lists for different sizes, so you don't have to scan through all the 8-byte blocks to get a 1 KB block.
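On the sbrk question: sbrk(n) just moves the process's program break up by n bytes and returns the old break, so you get one contiguous region to carve up yourself, whereas mmap hands back independent mappings. It's considered legacy these days (macOS has deprecated it, and glibc mostly uses mmap and brk internally). A minimal sketch, for comparison with the video's mmap approach:

#define _DEFAULT_SOURCE          /* sbrk is declared under _DEFAULT_SOURCE on glibc */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    void *before = sbrk(0);        /* current program break */
    void *block  = sbrk(4096);     /* grow the heap by 4 KiB; returns the old break */
    if (block == (void *)-1) { perror("sbrk"); return 1; }

    ((char *)block)[0] = 1;        /* the new region is ordinary writable memory */
    printf("break moved from %p to %p\n", before, sbrk(0));
    return 0;
}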
I remember looking through libmowgli's implementation of a custom allocator (mainly for checking how useful it would be for static memory allocation), and finding it to be quite impressive and relatively straightforward. I don't recall if it addresses the issues you bring up at the end of the video, but I wouldn't be surprised if it did.
I randomly had the idea to make a custom heap allocator a few days back and have been working on it for about a day. And boom, YouTube recommends me this video.
it was destiny
that is Google spying on you.
It also sounds like you didn't handle multiple threads allocating and deallocating chunks at the same time.
The easiest way to do that would be to wrap a big mutex around your alloc and free functions, but that would be really slow.
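Here's a sketch of that "big lock" version, with malloc/free standing in for the video's single-threaded alloc and free. It's correct but becomes a bottleneck as soon as several threads allocate heavily, which is why real allocators use per-thread caches or finer-grained locks instead.

#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-ins for the single-threaded allocator functions. */
static void *alloc_unlocked(size_t n) { return malloc(n); }
static void  free_unlocked(void *p)   { free(p); }

void *heap_alloc(size_t n)
{
    pthread_mutex_lock(&heap_lock);     /* serialize all access to allocator state */
    void *p = alloc_unlocked(n);
    pthread_mutex_unlock(&heap_lock);
    return p;
}

void heap_free(void *p)
{
    pthread_mutex_lock(&heap_lock);
    free_unlocked(p);
    pthread_mutex_unlock(&heap_lock);
}

int main(void)
{
    void *p = heap_alloc(64);
    heap_free(p);
    return 0;
}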
It wouldn't be possible to get a link or something to the code written for this video, would it? I'd like to take a gander at it so I may learn some more C and coding practices. Thanks either way, very cool video and idea!
0:48 was honestly expecting you to swing for sbrk
Would love longer videos like this tbh! 👍
I believe the standard library uses the brk()/sbrk() system calls to increase the heap size (but there is only one break pointer per process, so you would have to disable the standard library's allocator).
It's still there for compatibility, but discouraged against. Apple deprecated those calls in macOS ("The brk and sbrk functions are historical curiosities left over from earlier days before the advent of virtual memory management."), and glibc on Linux only calls it once to allocate some memory for the allocator itself, leaving the rest to mmap.
I love this video, so useful since I’m doing an OS class! I understand some of your words magic man!
Glad to hear it!
You're not supposed to allocate in bigass chunks for small amounts of data because those chunks flood the processor cache with junk data and cause cache misses which dramatically slows performance when it's constantly happening (which it will because you'd be doing it on every alloc).
A good start, why not. Of course your coalescing does not protect against heap fragmentation: consider the case where the caller frees every second chunk and keeps the others. There could be, and actually are, more complicated scenarios. Whatever you do or don't do, whatever tricky memory-guessing strategies you employ, you end up with a fragmented, unusable heap, so you have to take more memory from the OS (and eventually spoil that too), and you end up with memory leaks. Java fights that using indirect addressing and costly memory defragmentation in the GC, which does not fully address the problem, just defers the inevitable program crash and restart. The same thing happens to other resources in one form or another. That's how modern software works.
this statement is false. the inevitability of a program crash you mention is not a given. proper memory management can mitigate the effects of fragmentation. also, modern operating systems and programming languages have evolved to handle these scenarios more gracefully. while extreme fragmentation and poor memory management can lead to crashes and restarts, it's not an inherent characteristic of modern software.
programs don't necessarily have to crash due to memory issues. they might run slower or become inefficient, but a well-designed program with robust error handling and memory management strategies can continue to function, albeit perhaps with reduced performance, in the face of memory fragmentation.
so shut up
@@LuXxenatorX OK, I'd better shut up indeed. I'm not good at talking to people whose understanding of a problem consists of primitive slogans.
Try breaking up your memory chunk into separate pools where all of the allocations in a given pool are the same size. Round an incoming memory request up to the next largest chunk size you have a pool for. Then you won't have external fragmentation, because all of the allocs in a given pool are the same size... at the cost of wasting memory on the round-ups.
Also, consider using thread local storage so that each thread can have its own allocator (and set of pools)... this way you won't need mutexes.
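A rough sketch of that combination, with made-up pool sizes: requests are rounded up to a fixed pool size, and each thread keeps its own free lists in C11 _Thread_local storage, so the fast path needs no mutex.

#include <stdlib.h>

typedef struct node { struct node *next; } node_t;

#define NPOOLS 4
static const size_t pool_size[NPOOLS] = { 16, 64, 256, 1024 };

/* One set of free lists per thread: no sharing, hence no locking. */
static _Thread_local node_t *pool[NPOOLS];

static int pool_index(size_t n)
{
    for (int i = 0; i < NPOOLS; i++)
        if (n <= pool_size[i]) return i;     /* round up to the next pool size */
    return -1;
}

void *pool_alloc(size_t n)
{
    int i = pool_index(n);
    if (i < 0) return malloc(n);             /* too big for any pool: fall back */
    if (pool[i]) {                           /* reuse a block of exactly this size */
        node_t *b = pool[i];
        pool[i] = b->next;
        return b;
    }
    return malloc(pool_size[i]);             /* stand-in for carving a fresh block */
}

void pool_free(void *p, size_t n)
{
    int i = pool_index(n);
    if (i < 0) { free(p); return; }
    node_t *b = p;
    b->next = pool[i];                       /* push back onto this thread's list */
    pool[i] = b;
}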
Thinking quickly, Dave constructs a homemade megaphone, using only some string, a squirrel, and a megaphone.
This looks pretty neat. I'm considering making a small allocator in Rust, because Rust doesn't let you pick the size of a new array (not a Vec) at runtime. Hopefully it won't be too complicated.
Have you tried boxed slices?
I remember doing this when I was working through the C programming language book. Fun times
Writing a really performant memory allocator is much more complex. Check the code of mimalloc, jemalloc or TCMalloc, which basically all internally work the same way.
I did something similar, but it had a bit more to it than that and was hell to get working; when it worked, though, it worked well. As I recall it was the debugging that was the trouble: it would screw up, but only once in a blue moon.
Great video! I really appreciate your ability to make these videos highly educational, while also being fun and easily digestible.
I appreciate that!
Have you tried a memory arena/area allocator? Any time that memory can be allocated in the arena, it brings more benefits than you could imagine (okay, you could imagine it). No leaks, no per-object dealloc required, no calls to malloc... It's like writing in Python, but it's in C.
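For anyone who hasn't seen one, a minimal bump-style arena sketch of that idea: one upfront block, allocation is just a pointer bump, and everything is released at once by resetting or destroying the arena, hence no per-object free.

#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *base;
    size_t used;
    size_t cap;
} arena_t;

static int arena_init(arena_t *a, size_t cap)
{
    a->base = malloc(cap);
    a->used = 0;
    a->cap  = cap;
    return a->base != NULL;
}

static void *arena_alloc(arena_t *a, size_t n)
{
    n = (n + 15) & ~(size_t)15;            /* keep allocations 16-byte aligned */
    if (a->used + n > a->cap) return NULL; /* arena exhausted */
    void *p = a->base + a->used;
    a->used += n;                          /* "allocation" is just a bump */
    return p;
}

static void arena_reset(arena_t *a)   { a->used = 0; }   /* frees everything at once */
static void arena_destroy(arena_t *a) { free(a->base); }

int main(void)
{
    arena_t a;
    if (!arena_init(&a, 1 << 20)) return 1;      /* 1 MiB up front */
    char *msg = arena_alloc(&a, 32);
    strcpy(msg, "hello from the arena");
    arena_reset(&a);                             /* no per-object dealloc */
    arena_destroy(&a);
    return 0;
}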
You could use segregated free lists. I did this same project in my Computer Systems class!
Virginia Tech I assume? (just bc no other college I've seen calls it computer systems xd)
Could you do a similar video but showing us what’s in the heap? It would be cool if you could print out a table with everything in the heap
Not to downplay this, but my first CS elective in low-level systems had us write our own memory allocator. It's a standard project in most curricula.
How about adding an expiration time to each allocation (for example, in milliseconds)? That way expired memory could easily be reassigned.
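Just to sketch what that suggestion could look like (it isn't how general-purpose allocators behave, since reclaiming memory a caller still uses would be unsafe, so treat this purely as an illustration): each block's header could carry a deadline, and the allocator could treat blocks past it as reclaimable.

#define _POSIX_C_SOURCE 200809L   /* for clock_gettime */
#include <stdbool.h>
#include <time.h>

typedef struct {
    struct timespec expires;   /* after this point the allocator may reclaim the block */
    /* ...size, free-list links, etc. would live here too... */
} block_header;

/* True once the block's deadline has passed. */
bool block_expired(const block_header *h)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return now.tv_sec > h->expires.tv_sec ||
           (now.tv_sec == h->expires.tv_sec && now.tv_nsec >= h->expires.tv_nsec);
}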
2:00 "If they ask for more memory than the Heap size
too bad so sad get the fuck out of here."