Back in the stone age when I took CS classes on SunOS workstations, we had to write a simple malloc/free library for one of our classes. One of my classmates went all out and wrote a malloc implementation that was more efficient at freeing up memory than Sun's standard C library. I can't vouch for it being more bug free though.
I just ran across this channel and it makes me miss programming. I work in IT but I didn't become a developer/programmer. I am glad I can still follow *most* of this. 😃
You can also put printf's in there to see when you are allocating and freeing memory (helps to find leaks) in the console. Good video.
But printf could use malloc, and calling malloc inside malloc is a really bad idea.
@@maxsilvester1327 You can guard against re-entrancy
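A minimal sketch of what such a re-entrancy guard could look like. All names here (`logged_malloc`, `log_allocation`, `in_hook`, `log_count`) are made up for illustration and are not from the video. The logging helper allocates through the same wrapper on purpose, standing in for a printf that calls back into an interposed malloc.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the video's code. */
static _Thread_local int in_hook = 0;   /* nonzero while we are logging */
static int log_count = 0;               /* how many calls were logged   */

void *logged_malloc(size_t size);

/* Stand-in for printf: it allocates a scratch buffer through the same
 * wrapper, the way a real printf may call an interposed malloc. */
static void log_allocation(size_t size, void *ptr)
{
    char *scratch = logged_malloc(64);
    if (scratch != NULL) {
        snprintf(scratch, 64, "malloc(%zu) = %p\n", size, ptr);
        fputs(scratch, stderr);
        free(scratch);
    }
    log_count++;
}

void *logged_malloc(size_t size)
{
    void *ptr = malloc(size);
    if (!in_hook) {          /* skip logging for allocations made while logging */
        in_hook = 1;
        log_allocation(size, ptr);
        in_hook = 0;
    }
    return ptr;
}
```

Without the `in_hook` check, the allocation made inside `log_allocation` would trigger another log call, which would allocate again, and so on until the stack runs out.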
Interesting way to wrap standard functions. Very useful.
Allocators are actually really rewarding to make. I like making arenas and dividing parts of the program into groups. This can avoid complicated malloc-free or new-delete strategies, as you can just reset an arena when you are done with some problem that needed a few, or many, different allocations. And the next time you need to allocate into the same arena, you don't even need to get memory from the OS; you just reuse the memory from the arena you just reset.
NUMA is a good example of why you'd want to write or modify your allocator. If you want to make sure the memory you allocate is local to your current processor, you need to specify the NUMA node on a multi-node system.
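For anyone who hasn't seen an arena before, here is a rough sketch of the idea, assuming a simple bump-pointer design. The `Arena` type and the `arena_*` function names are hypothetical, picked just for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump-pointer arena; names are illustrative only. */
typedef struct {
    unsigned char *base;   /* backing block, obtained from malloc once */
    size_t capacity;       /* size of the backing block in bytes */
    size_t used;           /* bytes handed out since the last reset */
} Arena;

/* Returns 1 on success, 0 if the backing block could not be allocated. */
int arena_init(Arena *a, size_t capacity)
{
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Hands out `size` bytes aligned for any object type, or NULL when full. */
void *arena_alloc(Arena *a, size_t size)
{
    const size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) / align * align;
    if (start > a->capacity || size > a->capacity - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Frees every allocation at once; the backing block is kept for reuse. */
void arena_reset(Arena *a) { a->used = 0; }

/* Returns the backing block to the system. */
void arena_destroy(Arena *a)
{
    free(a->base);
    a->base = NULL;
    a->capacity = a->used = 0;
}
```

After `arena_reset`, the next `arena_alloc` hands out the same memory again without going back to malloc or the OS, which is the reuse described above.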
This Prof has made CS interesting! Thanks!
Isn't dlsym platform-specific?
I'm not sure it's usable on some platforms that, for instance, do not support any dynamic library loading (like the 3DS).
Amazing video. This topic is very interesting
One cool thing about Zig is that you choose your memory allocator.
But one issue with zig is the entire language
@@user-ux2kk5vp7m how so? Seems pretty good to me
Fantastic tip! Thanks for sharing
Wow! Super helpful... thank you!
Thank you for the video! Could you comment on how this relates to a general Allocator type in C++ (e.g. std::allocator)? Would making a custom implementation of the Allocator type in C++ be a better approach than overloading malloc?
In C++ it's best to use a custom Allocator; it is more idiomatic. Using cstdlib in C++ is considered a code smell, especially malloc and free. You might as well use C in that case.
You can also overload the new operator
I found custom allocators very useful for CUDA. You just write a simple custom allocator, and you get all the STL containers working on your GPU! I know there is a CUDA version of the STL maintained by Nvidia, but still
10:15 - thanks, man. As if the bugs I had already found were not enough for me...
In C++ in dispatch mode you want to replace the new allocator, as you want the memory to be non-pageable, depending on the driver, especially if your code is running on the paging disk. If (on Windows) you are on the paging storage stack, you also need to make sure you have a backup non-pageable heap to make forward progress in the case of extreme memory pressure. So it becomes more complicated, because if the initial allocation fails you want to go into serial allocation mode, which is why low-level kernel code can be complicated. If using C, you simply call the exallocnonpageable functions (Windows) to do that. But that also means you need to serialize your ops, because your previous allocation can really only be used for I/O and building MDLs. In my code, on boot you create I/O packets preallocated specifically for this. But a more generic case is that you create a backup heap and make sure anything that touches that heap is marked as non-pageable.
It initialized *p2 to 0 because (I think) new calls the constructor; in the case of int it's 0
In C++ when you call `new int` it calls the default constructor of int, which sets the variable to 0
(Edit: wrong, built-in types don't have constructors but initialization syntaxes that make them look like they do, so you have to call `new int();` for 0 initialization)
The default constructor of int? As far as I know, base types don't have any constructors.
@@nikitabelov1478 they do in C++, otherwise you couldn't do `int i(42);` for example
@@leokiller123able It's not a constructor. It is direct-initialization, which just has the same syntax.
i.e. `int i(42)` is not the same thing as `string a("something")`;
`int i(42)` is just another way of saying
`int i = 42`.
Excellent as always! I added a global counter, and in the malloc function this counter is increased by one. It turned out at the end of the program that this counter has the value 646. Hence the malloc function was called that many times! That puzzles me. Could you elaborate on where all these calls come from? For completeness, I ran this C (not C++) program on a Raspberry Pi 3, compiled with gcc.
If possible, you could make the malloc wrapper take in a line number and file name, then just print those. You can wrap the malloc wrapper in a macro, like `#define malloc(SIZE) (malloc_wrapper((SIZE), __FILE__, __LINE__))`. This would show you where the calls come from. You can additionally use `__func__` to get the function name.
You could add a breakpoint inside your custom malloc function and look at the call stack to see where it's being called from.
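Here is roughly how the macro idea from this thread could be put together. This is only a sketch: `malloc_wrapper` follows the name used in the comment above, while the extra `func` parameter, the `malloc_calls` counter and `make_buffer` are assumptions added for illustration. Keep in mind that a macro only rewrites calls in source files that see it; calls made inside already-compiled libraries are not affected.

```c
#include <stdio.h>
#include <stdlib.h>

static unsigned long malloc_calls = 0;   /* hypothetical global counter */

/* Defined before the macro below, so this malloc is the real one. */
void *malloc_wrapper(size_t size, const char *file, int line, const char *func)
{
    malloc_calls++;
    fprintf(stderr, "%s:%d (%s): malloc(%zu)\n", file, line, func, size);
    return malloc(size);
}

/* From here on, every malloc(...) in this file records its call site. */
#define malloc(SIZE) (malloc_wrapper((SIZE), __FILE__, __LINE__, __func__))

/* Example caller: the malloc below goes through malloc_wrapper. */
char *make_buffer(size_t n)
{
    return malloc(n);
}
```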
Very cool. I was thinking about how else one can intercept calls, out of interest. One way is to write your own strace and intercept the system calls. I have been playing with LibVMI too, which was difficult to get up and running and not good on documentation/tutorials. But it is potentially very powerful. Uses include fuzzing and malware monitoring.
I swear I remember a video where he recreates malloc but can’t seem to find it. Which was it?
New actually calls the 'constructor' for int. Which always initializes it to zero by default.. which is why you weren't getting a warning...
I have started using backtrace, but this also looks interesting
It would be better to switch DEBUG to NDEBUG. It is already used, for example, in the `<cassert>`/`<assert.h>` header files, and some build systems (CMake) define NDEBUG for Release builds.
When is there going to be a full C/C++ course, Professor Jacob?
There are millions of them online.
@@gregoryfenn1462 Well, it's not the same teacher. Professor Jacob's teaching method is totally different, not to mention the experience he has. It saves a lot of time because you get to know the best practices. I am sure everyone on this channel would want a full course from him. What other good course would you recommend?
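A small sketch of what keying the debug behaviour on NDEBUG might look like. The wrapper name `debug_malloc` and the fill byte `DEBUG_FILL_BYTE` are assumptions made up for this example, not taken from the video.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical marker byte written into fresh allocations in debug builds. */
#define DEBUG_FILL_BYTE 0xCD

/* Fills new memory with the marker so reads of uninitialized memory stand
 * out. Compiling with -DNDEBUG removes the fill, the same switch that
 * disables assert(). */
void *debug_malloc(size_t size)
{
    void *ptr = malloc(size);
#ifndef NDEBUG
    if (ptr != NULL)
        memset(ptr, DEBUG_FILL_BYTE, size);
#endif
    return ptr;
}
```

The upside is that one switch controls both the asserts and the debug fill, so there is no separate DEBUG define to keep in sync.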
Any recommendations on a book to learn how to write your own allocator?
Alternatively, one could typedef the function type directly: `typedef void *malloc_like_function(size_t);` and next use `malloc_like_function*` whenever a function pointer is desired.
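To illustrate the function-type typedef, here is a short sketch. Only `malloc_like_function` comes from the comment above; `counting_malloc`, `current_alloc`, `allocate` and `set_allocator` are hypothetical names.

```c
#include <stdlib.h>

/* The function type itself (not a pointer to it), as suggested above. */
typedef void *malloc_like_function(size_t);

/* The type can declare a function: void *counting_malloc(size_t); */
malloc_like_function counting_malloc;

static size_t counted_calls = 0;

/* Hypothetical allocator with a malloc-compatible signature. */
void *counting_malloc(size_t size)
{
    counted_calls++;
    return malloc(size);
}

/* Pointer declared from the function type; starts out as plain malloc. */
static malloc_like_function *current_alloc = malloc;

void *allocate(size_t size) { return current_alloc(size); }
void set_allocator(malloc_like_function *fn) { current_alloc = fn; }
```

The explicit `*` at each use keeps it visible that a pointer is involved, which a pointer typedef would hide.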
Dr. Sorber, these videos are indeed instructive to me.
That's always good to hear. Glad I could help.
Could we make this code OS-independent?
Yeah, probably.
Just confirming: VS still puts safety nets inside and around allocated memory (0xCC)
"If you've been programming for more than few hours you'd probably used it". In the meantime, I have about 5000 slocs in my game written in C without a single dynamic allocation. 🙂
Nice. 😂
What are the jobs where you write C for a living, other than being a professor?
Embedded development eg for flight ✈️ control systems - that’s my job
Legacy maintenance as well. There is tonnes of old C code that isn't going to be replaced anytime soon, and you need someone to make sure it's maintained.
Hmm, isn't there a way of getting the sysmalloc only once, instead of in each call of the new malloc?
This looks horribly inefficient.
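One common way to do the lookup only once is to cache the result in a static pointer. The sketch below assumes that approach. `resolve_real_malloc` is a hypothetical stand-in for the real symbol lookup (something like `dlsym(RTLD_NEXT, "malloc")` in an interposing library) so that the example stays portable, and `lookups` exists only to show how often the lookup runs.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void *(*malloc_fn)(size_t);

static int lookups = 0;   /* counts how often the expensive lookup runs */

/* Stand-in for the symbol lookup. In an interposing library this would be
 * something like dlsym(RTLD_NEXT, "malloc"); here it just returns the
 * standard malloc so the sketch stays portable. */
static malloc_fn resolve_real_malloc(void)
{
    lookups++;
    return malloc;
}

/* The static pointer keeps its value between calls, so the lookup runs
 * only on the first call. Not thread-safe as written. */
void *wrapped_malloc(size_t size)
{
    static malloc_fn real_malloc = NULL;
    if (real_malloc == NULL)
        real_malloc = resolve_real_malloc();
    return real_malloc(size);
}
```

After the first call, each allocation costs one extra pointer check and an indirect call.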
Really interesting video, but don’t think I’d have called it C++ - just looks like C with a .cpp extension
Isn't it better to just avoid using uninitialized memory? For local variables const and auto already make this kind of error impossible; for heap allocation - make_shared and make_unique.
That’s the issue, you can forget to initialise a variable
In C++ we have default constructors, so when you write `int i;` for example, i is automatically set to 0 (it's the same as `int i();`), which is not the case in C as there are no such things as constructors/destructors. So in C++ you can't have uninitialized values (except if you explicitly call malloc or some other allocator than new for pointers).
@@leokiller123able Wrong in multiple ways.
EDIT: "class_name object();" on its own does not run the default constructor, "class_name object;" does.
For global and static variables, `int i;` will indeed initialize to zero, for *local* variables the value is undefined.
Not to mention it is bad style to rely on a variable (maybe) being initialized to zero. It makes the code more readable to just spell it out:
int i = 0;
or, much better in C++,
auto i = 0;
because auto makes it impossible to "forget" to spell out the "= 0" part.
@@betareleasemusic Are you sure about that? Because for class types, when you just declare `class_name variable_name;` it calls the default constructor, and I believe that it's the same for built-in types in C++. It feels wrong that it isn't, and I always thought it was the case, but sorry if I said something wrong.
@@leokiller123able Yes, but `int` is not a class in C++...
Obviously in C++0x nobody would use new like that; you’d just do `int *x = new int()` for an int initialised to 0
new uses malloc() under the hood, delete uses free()
@@glee21012 I don't disagree, it does this mainly for backward compatibility reasons. But there's not much point in writing your own malloc to initialise values like this in C++ when you can do it as part of the language.
@@glee21012 Not to mention that free doesn't have to use malloc, if you overload it to prevent it from doing so.
@@privileged6453 As other people have said, this technique can be used for other purposes, like finding memory leaks. In malloc, you can register the file and line that it was called from (using the preprocessor) to some kind of dictionary, and then when you free the memory, remove it from the dictionary. Then at the termination of your program you can print out all the data in the dictionary to catch when you aren't freeing your memory.
@@privileged6453 Nothing worse than using free on a bad pointer - I KNOW lol
With Zig it is easier to modify and create custom allocators.
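A rough sketch of the leak-tracking idea described in this thread, assuming a fixed-size table in place of a real dictionary. Every name here (`Allocation`, `tracked_malloc`, `tracked_free`, `report_leaks`, `TRACKED_MALLOC`, `MAX_TRACKED`) is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical fixed-size table standing in for the "dictionary". */
#define MAX_TRACKED 1024

typedef struct {
    void *ptr;          /* NULL marks an empty slot */
    const char *file;
    int line;
} Allocation;

static Allocation live[MAX_TRACKED];

void *tracked_malloc(size_t size, const char *file, int line)
{
    void *ptr = malloc(size);
    if (ptr == NULL)
        return NULL;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (live[i].ptr == NULL) {
            live[i] = (Allocation){ ptr, file, line };
            break;
        }
    }
    return ptr;   /* if the table is full the block is simply untracked */
}

void tracked_free(void *ptr)
{
    if (ptr == NULL)
        return;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (live[i].ptr == ptr) {
            live[i].ptr = NULL;
            break;
        }
    }
    free(ptr);
}

/* Prints every block still registered and returns how many there are. */
size_t report_leaks(void)
{
    size_t count = 0;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (live[i].ptr != NULL) {
            fprintf(stderr, "leak: %p allocated at %s:%d\n",
                    live[i].ptr, live[i].file, live[i].line);
            count++;
        }
    }
    return count;
}

#define TRACKED_MALLOC(SIZE) tracked_malloc((SIZE), __FILE__, __LINE__)
```

To get the report at program termination, a small `void` wrapper around `report_leaks` could be registered with `atexit()`. The linear scans are slow for large programs; a hash table keyed on the pointer would be the obvious next step.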
Great content. Had never seen something like this. Also I wanna mention that this code was almost pure C. I see no reason for you to promote it as C++ code. It has hardly any kind of resemblance to C++.
Thanks. Always glad when I can show someone something new.
And, please see my video about C/C++ and my opinions about what gets to be called C++. Spoiler: if it compiles with a C++ compiler, it's C++ (by definition).
@@JacobSorber It's true. But this code is still more C-like rather than C++. You know that modern C++ looks quite different.
Your code can't be considered as proper modern C++ code since it uses deprecated features like deprecated headers such as stdio.h and string.h (these are deprecated in C++ and the equivalents cstring and cstdio are strongly recommended by the committee).
As a software engineer said in an answer to a Quora question, if you want to write C then write in pure C. Not C++.
Also, compiling with a C++ compiler doesn't have any benefits for C code as far as I know. C compilers might even generate faster code in some cases in which C differs from C++.
C-style C++ is an actual thing; Doom 3 was written like that.
Don't really like this for Linux, it's much better to use valgrind, since this catches 95% of memory bugs (from my experience)
I think it is disingenuous to say that this is a C/C++ trick:
1. dlsym() is not part of either the C or C++ standard library. It is a UNIX system call. It doesn't have the portability of C or C++.
2. It solves a problem that you would never encounter in C++ if you followed the C++ Core Guidelines.
I hope you mean "imprecise" rather than "disingenuous", unless of course you think that I'm intentionally leading people astray. You are right that dlsym() is not part of the language standard (it is part of the POSIX standard), and while we're being precise, it's a library call, but not a system call (at least on Linux and macOS). But, replacing your allocator is something that you can do (for debugging purposes) on just about any OS using C and C++, which was the point I was trying to make. Sorry if that wasn't clear. As for #2, that's like saying, "you don't need to debug code if you just follow best practices and always write correct code." I suppose it's true, but it doesn't actually seem to decrease the number of bugs out there or impact the need for debugging.