For data that is not self-describing, you need a way to share the common structure across multiple source code files. It's also helpful for sharing common source processing directives. This concept originated before computers with "boilerplate" text and copy books and the term "copy book" was adopted by Amazing Grace for Cobol.
Another interesting observation is that a C header file works as an "interface" type. Where C++ or Python have a base class, Rust has a trait, and C# has an interface, in C you have .h files. A header describes the object types (structs with or without typedefs) and the (virtual) procedures that are available in the module. An accompanying .c file or feature test macro (for single-implementation header files) can be added to choose the implementation. This way, a .h file can declare a bunch of functions, and you could, for instance, implement one .c file per platform. The .h file is the abstract class/interface/trait, the .c file is the derived class/interface implementation/impl. C was data-object oriented from the start.
3:14 (yes pi!) There are 3 stages to compiling: preprocessor, compiler, and assembler. First all preprocessor directives and macros are expanded, then it's turned into assembly language, then that is assembled into .o files.
Btw, for linker paths in e.g. proprietary programs, patchelf from the NixOS project exists. It lets you change paths to libraries in standard Linux ELF files, which they have to use as that distro doesn't use /usr/lib.
I’ve never found header files an issue, if anything I find them a blessing. A nice readable prototype of an implementation for a function that I don’t want to know the implementation details from. I hear everybody complain about it and it makes no sense to me as to why. I hate those people who do implementations also in header files.
Just to add if you don’t mind. I remember Robert C. Martin ("Uncle Bob") and someone else explaining that a header file is similar to a Go/Java interface and can be made to serve the purpose of hiding private functions. Lately though I have doubts about abstraction layers themselves. Most of the time I find that hiding the details behind interfaces makes the behaviour of code harder to understand, but that’s anecdotal. At the same time, there is no way to code without abstraction.
@@candrar2866 nice addition! Usually I don’t hold Uncle Bob in very high regard, but a header can be seen as an interface definition. That’s the difference with Java and C++, where public, private, and protected are all part of the definition. Which shouldn’t be needed really, but back in the early 1970s when C was created the sole purpose was to develop operating systems. Those systems didn’t have the power to quickly scan binaries to find functions and bind to them, because reading binaries was so slow. So basically header files started as a helper for underpowered linkers. These days we can do that without effort and the idea of a header file has been dropped in all modern languages. But I do like a header file strictly as an interface, like Uncle Bob sees them. You get a binary library and a header file and you know exactly how to call the functions.
C learner here. Great video. I use #pragma once and intend to have declarations in the header files, but I often struggle in projects where I have multiple .c and .h files. I can think of two ways to deal with that: use static more, and move some of the declarations to the related .c file.
"Declaration" and "definition" have very specific meanings in C. You got it right at one point in the video, where you said the declaration is in the header but not the definition, but you kept using the wrong terms throughout the rest of the video. K&R assumed an implicit declaration of unknown functions as returning an int - parameters were not part of the signature - but ANSI added prototypes with parameters, and C99 made declarations for functions mandatory.
Isn't it simply because these source files actually don't know about each other? Each unit is compiled separately, and only linking at the end brings those units together.
To be honest I really like the video. However the title is clickbait or just plain inaccurate. You explained how they work, but not why they're there. Please rename it to how header files work or something similar. All of what you talked about in the video is solved in modern languages without the use of header files and I feel like this topic could really be made into a much more in-depth video.
I think explaining how the compiler isolates compilation units is sufficient. He is explaining "why?" in the context of "why does the compiler need header files?". What you are looking for is "Why was the C and C++ compiler designed with those restrictions?" but that is another topic, and one concerning a more historical perspective. Also, now that you know of these restrictions thanks to him, you can look it up!
My first C++ project was a game engine following TheCherno’s tutorial to start, and I learned how to use header files kinda naturally. I never even thought about “why” I'm doing it, I just knew when to use it.
Thank you very much, I literally dealt with tons of compiling issues when programming in go using the FFI and realized that I have never understood it correctly.
Headers are basically here to make compiling easier. They're a remnant of the past that stuck around. You see, the early C compilers were dumb, so if you tried to call a function that was declared later on in the source, they couldn't find it. Sometimes you could solve this by shuffling the functions around, but sometimes that didn't work because the functions depended on each other (e.g. function A needs to call function B, which in turn calls either function A or function B again in some recursion scenario). To solve that dumbness of the early C compilers, the header files were invented, where all the functions would be declared beforehand so the compiler won't think that a function that exists, doesn't. Nowadays compilers are a lot smarter, and many newer languages let you call a function that is defined later on in the code, but C still wants the declaration first, so the header files stuck around.
A great way to show this is with objdump output on .o files, .so files, and executables. It might be fun to do one about why the order of linker arguments matters (it's an iterative process!)
Header files create a prototype declaration of the interface to a full actual function. Only at link time do you need the code for the function. Ultimately the whole compilation process will be faster, and it gives you a full correctness check before linking.
I don't think the gcc call is enough since lowlevelmath is a shared/dynamic library. During run-time you got to tell the link loader where to find all necessary shared libs. It's kind of like the PATH environment variable for libraries
No, because the add function was in a shared object (.so) which is loaded at runtime (dynamic linking). Later in the video he uses .o files (static linking) so it's not necessary. He kinda glossed over that bit.
How dynamic libraries such as *.so / *.dll are found is kind of platform specific. E.g. on Windows, DLLs are automatically searched for in the application folder and a bunch of other folders (super unsafe, google DLL injection). On Linux you can embed an "RPATH" into your executable that tells the loader where to find the *.so file, or you use the environment variable as shown in the video. Note that this gets even more complicated: Android, iOS and co all use "rpath" but in very different ways.
Very well rounded discussion of header files. Although, you could have added more detail like pragma options and such, I like the way you discussed it since it's easier for beginners to understand without all of the details they can learn later.
I *really* wish this video (or something equivalent) had existed when I first started learning C in the 80s, because this stuff confused the hell out of me at the time, & was never explained properly in the texts I was learning from.
I am currently porting some stuff to Rust that uses a header file that you are supposed to include multiple times, changing other state in between so that you can declare multiple versions of a struct. This is pure insanity and should never have been possible.
3:11 Can you have another look at this? I think this was supposed to say "link -> dll" for Windows? As far as I know static libraries (*.a / *.lib) are not linked at all. E.g. on Linux the command "ar" will just bundle several *.o files into a single *.a. No linker involved. It would be great to see a deep dive into how static and dynamic libraries work in detail.
One thing is that all the code in the header files does get compiled in with the rest of the code which is why really big header files, or source code files with lots of headers take a while to compile since there's a lot of hidden code being analyzed by the compiler.
Please remember that 40 years back, making it easier for the computer to compile was extremely important. I remember old grumpy programmers complaining about the inefficiency of high level languages (C).
A video explaining pointers in the context of structs and vectors being passed to function would be great (What is the best way to access the value of a struct/array and assign it to another struct or variable inside of a function for example...), also one on cmake and best practices when developing with C would be awesome as well, just some suggestions! Thank you for the helpful videos!
So .h files exist to make it possible to hide the definition of the functions in the .c files. Header files allow the programmer to use the function without seeing the internals of the function. The provider of the code can send .o binaries with the functions, instead of the .c file and send a .h as a "manual" on how to use the black boxed function. But everything is open source if you read assembly.
I wouldn't say they only exist to hide function definitions. If you're compiling multiple source files into multiple object files, having a place to store commonly shared declarations like functions/enums/etc. makes more sense than having declarations being repeated in every source file.
Nice video. Maybe you'll do more videos about this topic in general in the future, so I have some constructive criticism (even if it was a while ago). If you don't, then maybe these points will shine some light on things for other people:
- you could mention that a #include directive really does what the name tells you: it INCLUDES the file named after the "#include" in the file where the "#include" is. This means that it COPIES the entire file into your file. I personally find this (in my opinion very smart) solution to the problem of declaring functions quite interesting.
- you could SHOW some assembly created without optimization. On the one hand from the .obj files, on the other hand from the linked executables/libraries. When I saw that for the first time, a lot of things became clearer for me.
- you could somehow illustrate the ACTUAL linking of symbols (and what does "symbol" even mean? 🙂 ). Maybe this is a somewhat difficult topic, but nonetheless very interesting.
- this last point is also a good cliffhanger to go IN-DEPTH on how the CPU really "executes" functions. How do parameters work, how do the OS and consequently the CPU "know" what to return to where, and what to take from where as arguments?
When I was doing C I used unity build. Instead of using header files I just included .c files directly. I had my main .c file that had the main loop, and included all the other files into that. It worked beautifully. This method has a downside and an upside. Downside is that you can't tell what each file depends on. The upside is that you have to figure out how to avoid a circular reference, which makes your code more disciplined and organized.
according to Wikipedia of C's history: *The preprocessor was introduced around 1973 at the urging of Alan Snyder and also in recognition of the usefulness of the file-inclusion mechanisms available in BCPL and PL/I. Its original version provided only included files and simple string replacements: **#include** and **#define** of parameterless macros. Soon after that, it was extended, mostly by Mike Lesk and then by John Reiser, to incorporate macros with arguments and conditional compilation.* So if you hate the preprocessor, blame Alan Snyder.
I'm definitely missing something here, but I'm struggling to understand how the .so file is "closed" enough to be a format that e.g. a manufacturer would use to distribute proprietary instructions that they don't want to be easily readable, but "open" (i.e., human-readable) enough that the linker can locate a specific function by name.
I haven't dealt with shared objects when hacking N64 games so I'm not 100% confident on how they really work, but I imagine that on a machine where the user has full access to everything, there isn't really any sort of built-in security beyond the tiny bit of obfuscation that happens when C is translated to machine code. However, it's possible to create a machine where only some parts of the machine are accessible to the user, while others are intended to be a trade secret of the manufacturer (think secret co-processors on the main CPU type stuff, which I believe is how the more modern Nintendo consoles try to achieve security). One possible implementation would be to design the hardware so that only the secret co-processor has direct access to the hardware resources... so like the shared object could be distributed publicly in an encrypted form that only the co-processor knows how to decrypt, and the co-processor or even just the hardware design could prevent the user from directly accessing the decrypted object file.
You can do it with the -aux-info option of gcc. You'd have to do this in the Makefile similar to how you can let the Makefile invoke gcc to generate dependencies. However, the resulting header files would contain definitions that you don't want to expose to other source files. It is possible to do it, but I don't think any serious C programmers ever do this.
@@DeveloVooshGWeb it's not based on difficulty, it's based on level of abstraction. High level languages are called that because they're at a higher level of abstraction.
include just copy pastes so really its just a place to put symbol declarations that are shared across different compilation units. object files contain stubs inserted by the compiler and the linker resolves these stubs to some actual memory address at some point. either static or dynamic linking. also i always put extern to make the intent clear for readability even if its not needed. extern int x; this forward declares a global symbol x which will have some address that will be inserted later. its the same for functions. a function is just a memory address the decoration around it is just so the compiler knows what parameters and how to pass things on the stack and what the return is. extern int Add(int, int); the actual definition could be anywhere as long as its part of one of the files that the linker will pull in.
Hello LLL! Could you explain what happens in the linking process? What does it do to the object file? How can the executable locate the library code at runtime to execute it? I like to know how this whole world of references and dependencies works in Linux. Thank you very much for sharing your knowledge, I really appreciate it.
Dynamic linking on Linux is done by ld.so. The object files are in the ELF format, which contains information about the symbols that still need to be resolved.
I was expecting this to answer why they decided to create these standard headers and not bake what is in them into the language. I assume it's a historical thing, possibly combined with just how low level C is intended to be. They didn't want to have code that is only added if it is used baked into the language itself.
That shows how they work, but I never really got why the compiler didn't just handle all of that to begin with. Why is another file necessary when it's just duplicating code that was already written? Wouldn't it make more sense to stay DRY?
You can let gcc generate a file with all declarations (using -aux-info). But in practice the programmer wants to control what declarations the other source files see. The header file is the interface to the source file, and it is a good thing to think about what you want to be in the interface. Declarations that are purely internal to the source file should be kept in the source file.
ok but why not make the .c files compile into libraries that can be read as a header files by LSP/Compiler? Like it already needs to have most of the info, some function signatures aren't gonna break it zzz and you could statically compile libraries or dynamically include them. headers are sometimes cool, but usually it's just more code for same thing.
If a source file includes "its own" header file, you can be sure that the compiler will flag any inconsistencies between the function declarations in the source file and the header file. Also, if the header file declares structs and types, you won't have to repeat those declarations in the source file.
idk man
+1
same dude.
same i'm literally shaking holding my JS teddy bear
now you know man
great
You can also use header files to declare structs w/o exposing their fields (you define them in the source file). That way you ensure that users of your library operate on structs only through pointers to them and the API you provide, so you achieve encapsulation.
public in any language: hello
Sounds like security by obscurity. Someone will eventually guess the names of your "private" fields!
@EdKolis it's not security, it's what the developer is able to use in source code. Of course there's ways to get around it, but the point is just to make it harder to do something you don't want them doing.
Is that what they call the Pimpl pattern?
Won't it cause allocation problems? When sizeof(struct) lies to you about real size? Or am I wrong somewhere?
Something you didn't mention is that header files are included in the preprocessor phase. The line _#include "myheader.h"_ is basically an instruction that tells the preprocessor to replace this line with the contents of _myheader.h_. This is why header inclusion is a # directive, and also why headers start with an #ifndef directive, to make sure that the same header isn't included more than once in the same source file. It also means that you don't have to limit yourself to declaring functions in the header file; you can technically write any code inside a header file and it will compile just fine, though it can lead to problems if multiple source files use the same header (multiple definition error).
inline my beloved
Headers could also start with a "#pragma once" command.
> though it can lead to problems if multiple source files use the same header (multiple definition error).
But why? Doesn't #ifndef/#pragma once fix this?
@@dimitar.bogdanov the ifndef/pragma prevents you from getting multiple definitions in one translation unit, but if you link with another compiled source file that included the header then you would basically be including that code twice, because each source file has its own copy of the header
@@davidfrischknecht8261 #pragma once is nonstandard.
Header files are only there because it makes it easier for the parser. They are technically not needed but this is the legacy of C and other languages so it's stuck around. I don't mind them personally but it's hard to justify their existence because removing them is just a technical problem which can be solved with a little effort.
I was just thinking the same. If one is naughty, one could completely ignore header files and just use implicit declarations all around.
C could use an actual module system like Rust but I suspect that's not going to happen any time soon. I heard C++20 introduced a module system, but then you're programming in C++ .... or some subset of it.
i dont get it, how would certain functions run then?
@@giantskeleton420 the compiler would just create an implicit declaration for any external function, which will always look like int functionName() (always an int and empty parameters). The linker will then just try to match the function name to what it can find in the libraries. If you pass the correct arguments to the functions and handle its return value correctly, everything should work just fine.
@@kerimgueney Wow, did not know that. Thanks for info!
@@kerimgueney so u could just include another .c file with the definitions and wouldnt need a header file?
Dealing with header files in c/c++ feels like doing the job of the compiler. I understand the need of a header file when linking with dynamic/static library. BUT in 99% of header files I wrote I also wrote the source file.
Having implementations in headers leads to longer compile times; a separate file.cpp can be compiled to an object file independent of other code
The header files will be useful if you get a new task that takes say 6 months and then need to return to the task, and also if you rise in the career and someone else needs to understand what you coded. Just regard them as your own well structured notes for yourself (and others).
Dealing with header files is one of the reasons why I decided to go with Java development career instead of C/C++.
@@rursus8354 Yes, that is what comments are for. In the best case, the header just says what the source file says. In the worst case, the header is a complex maze of ifdefs that are impossible to navigate.
When C and C++ were developed there was no such concept as a module
6 years after completing classes, i learned why <> and "" are used in the includes.
one searches the predefined system include paths and the other starts from a path relative to the current file.
I think i got u beat, i just learned that today....been programming for around 14 years now..........
Be careful with those $(pwd) calls, they undergo bash word expansion and can be broken into two different arguments if your path has a space in it, always quote those things with double quotes ", same if you were to do "$PWD" (just reading the variable PWD instead of running the pwd command in a subshell)
Fuckin' sh, man. How'd it catch on?
There are so many things that break if you have a space in the path for this reason!
I know someone said you are a gem for embedded, but you are also one for game dev and low levelers with hardware restraints. I love it.
I am just a little too deep, but I saw 'header file' reading the raylib documentation as being modular and interchangeable.
And I would be lying to say I had any clue what a header file or what a C file with a header file really...means.
This video clears up a lot of overcomplicated imagined semantics. Thanks!
include just copy pastes it into source files. declarations just tell the compiler that the linker will take care of it later so it can generate a stub for the linker. headers just give one convenient place for these declarations that just get included in each file. that way if you have to update it its in one place and all other files will just include that header
can’t thank this man enough for his absolutely no BS approach to systems computing/programming. he could have done 2 videos on zig or 3 videos on rust and just touched the surface of those .. for content. but he goes deep into basics of how real world systems work … not BS toy projects. thank you you kind man.
This reminds me of my earliest programming experience where it was still common to do the compile and linking semi manually. It was also common to have an automation, but still when I started, and this was not as long ago as you might think, manually compiling and linking simpler programs was still considered common practice.
Can you make more videos about the building process?
I really enjoyed this one
At first, I didn’t appreciate headers, I considered them cumbersome. I always thought that you’re repeating yourself. Then I started writing a rendering engine, and oh are they a lifesaver. Declaring one header file that’s used across 4 APIs has saved me a lot of time and effort instead of writing one for each.
Technically, in the last step you could just list all your .c files instead of .o files, but it would mean you have to compile both files every time even if only one of them changes.
It would have to be huuuuge to make a real impact at today's processing speeds
Basically, with an implicit function declaration the compiler assumes the return type (int) and takes the arguments as they appear in the call.
If compiled to assembly/binary, it can pass its parameters into appropriate registers and read back an expected return from the return register.
But the binary also needs to know the address of the function, to jump there and continue with the code execution of that function. To find out this address, the files need to be linked.
If no matching function can be found at this time, the linker can only give up.
In many modern strongly typed languages and IDEs, we would expect the IDE to already have done that lookup in real time as we type the code, so we may not even be allowed to start compiling.
Instead of "why do header files even exist" you explained how header files work, but the question from the title of the video still stands.
You may have noticed that header files are almost exclusive to C/C++ languages, other languages somehow don't need them. So why do header files even exist?
For those wondering, header files exist mostly for historical reasons. Since memory was very limited back when C was developed, compilers couldn't afford to keep track of modules themselves, so it became the job of the programmer. Modern compilers are far less hardware restricted, which allows them to favor developer experience over efficiency.
thank you for explaining
Wow someone explained it. Thanks!
how are they different from libraries? For example in Python we have to import/specify libraries we'd like to use too. Header files have additional functions built inside them if I am not wrong?
@buak809 , well they most of the time aren't, the main difference is you have to write them yourself instead of the compiler figuring it out of your code
For future viewers: newer compilers will treat 1:39 as an error (as if -Werror=implicit-function-declaration), not as a warning, as it is invalid C99. The link step will not occur.
For data that is not self-describing, you need a way to share the common structure across multiple source code files. It's also helpful for sharing common source processing directives. This concept originated before computers with "boilerplate" text and copy books and the term "copy book" was adopted by Amazing Grace for Cobol.
Another interesting observation is that a C header file works as an "interface" type.
Where C++ or Python have a base class, Rust a trait, and C# an interface, in C you have .h files.
It describes any object types (structs with or without typedefs) and the (virtual) procedures that are available in the module.
An accompanying .c file or Feature Test Macro (for single-implementation header files) can be added to choose the implementation.
This way, a .h file can declare a bunch of functions, and you could, for instance, implement one .c file per platform. The .h file is the abstract class/interface/trait, the .c file is the derived class/interface implementation/impl.
C was data-object oriented driven from the start.
3:14 (yes pi!) There are 3 layers to compiling: Preprocessor, Compiler, and Assembler. First all preprocessor tokens are expanded, then it's turned into assembly language, then turned into .o files
Btw, for linker paths in e.g. proprietary programs, patchelf from the NixOS project exists. It lets you change paths to libraries in standard Linux ELF files, which they have to use as that distro doesn't use /usr/lib.
I love your work man, although I know this info already but I'm enjoying watching you explaining it with this much practical details.
I’ve never found header files an issue, if anything I find them a blessing. A nice readable prototype of an implementation for a function that I don’t want to know the implementation details of. I hear everybody complain about it and it makes no sense to me as to why. I hate those people who do implementations also in header files.
Just to add if you don’t mind.
I remember Bob C. Martin and someone else explaining that a header file is similar to a Go/Java interface and that it could be made to serve the purpose of hiding private functions.
Lately though I have doubts about the abstraction layers themselves. Most of the time I find that hiding the details behind interfaces makes the behaviour of code harder to understand, but that’s anecdotal. At the same time, there is no way to code without abstraction.
@@candrar2866 nice addition!
Usually I don’t hold uncle Bob in a very high regard but a header can be seen as an interface definition.
Now that’s the difference between Java and C++ you have all the definition public, private, protected. Which shouldn’t be needed really but back in 1970 when C was created the sole purpose was to develop operating systems. Those systems didn’t have the power to quickly scan binaries to find functions and bind to them. Because reading binaries was so slow. So basically the whole header files started as a helper for underpowered linkers.
These days we can do that without effort and the idea of a header file is dropped in all modern languages.
But I do like a header file strictly as an interface like Uncle Bob sees them. You get a binary library and a header file and you know exactly how to call the functions.
I fully agree
You know you can just collapse the function contents in any proper editor to get basically the same information? IIRC it's ctrl-K ctrl-0 in VSCode.
@@imaginerus then you will need to send the code along and compile the whole code. The idea of libraries is that these are already in binary format.
C learner here. Great video. I use #pragma once and intend to have declarations in the header files, but I often struggle in projects where I have multiple .c and .h files. I can think of two ways to deal with that: use static more, and move some of the declarations to the related .c file.
"Declaration" and "definition" have very specific meanings in C. You got it right at one point in the video, where you said the declaration is in the header but not the definition, but you kept using the wrong terms throughout the rest of the video. K&R assumed an implicit declaration of unknown functions as returning an int - parameters were not part of the signature - but ANSI made declarations for functions mandatory and added parameters.
07:12 Doing `$(pwd)` instead of `$PWD` is like doing `cat file | grep pattern` instead of `grep pattern file` 💀
Dude you have the best vids ever. Engaging AND technical.
Isn't it simply because these source files actually don't know about each other? Each unit is compiled separately, only linking at the ends brings those units together.
Correct.
Great video, I've used C a bit, not loads, and you explained so many things that I just kinda accepted without really fully understanding.
It's funny - when I write C++, I really like that C++ has headers. When I write C#, I really like that it doesn't have headers.
To be honest I really like the video. However the title is clickbait or just plain inaccurate. You explained how they work, but not why they're there. Please rename it to how header files work or something similar. All of what you talked about in the video is solved in modern languages without the use of header files and I feel like this topic could really be made into a much more in-depth video.
Yes. I watched the whole video waiting for the "why they exist" and it never came.
I think explaining how the compiler isolates compilation units is sufficient. He is explaining "why?" in the context of "why does the compiler need header files?". What you are looking for is "Why was the C and C++ compiler designed with those restrictions?" but that is another topic, and one concerning a more historical perspective. Also, now that you know of these restrictions thanks to him, you can look it up!
They exist so you don't have to copy all those definitions into every source file and maintain them there.
My first C++ project was a game engine following theCherno’s tutorial to start, and I learned how to use header files kinda naturally. I never even thought about “why” I'm doing it, I just knew when to use it
Actually, I started learning c++ the exact same way, about 2 years ago with his sparky series.
Are you still working on that!?
Thank you very much, I literally dealt with tons of compiling issues when programming in go using the FFI and realized that I have never understood it correctly.
Headers are basically here to make compiling easier. They're a remnant of the past that stuck around.
You see, the early C compilers were dumb so if you tried to call a function that was declared later on in the source, they couldn't find it. Sometimes you could solve this by shuffling the functions around, but sometimes it didn't work when the functions depended on each other. (e.g. Function A needs to call function B that in turn either calls Function C or function B again in some recursion scenario).
To solve that dumbness of the early C compilers, header files were invented, where all the functions would be declared beforehand so the compiler won't think that a function that exists, doesn't.
Nowadays, the compilers such as GCC and MSVC are actually pretty smart about declarations and can find the function that was declared later on in the code, but the header files stuck around.
This made sense to me, thanks!
@@visionshift1 You're welcome
thank you, you not only explained me header files but also the point of using Makefiles towards the end ❤
A great way to show this is with objdump output on .o files, .so files, and executables.
It might be fun to do one about why the order of linker arguments matter (it's an iterative process!)
Header files create a prototype declaration of the interface to a full actual function.
Only at link time you'd need the code for the function. Ultimately the whole compilation process will be faster, and gives you a full correctness check before linking.
Why did you have to set an environment variable at 7:37? Was the call to gcc at 7:20 not sufficient to make the association?
I don't think the gcc call is enough since lowlevelmath is a shared/dynamic library. During run-time you got to tell the link loader where to find all necessary shared libs. It's kind of like the PATH environment variable for libraries
No, because the add function was in a shared object (.so) which is loaded at runtime (dynamic linking). Later in the video he uses .o files (static linking) so it's not necessary. He kinda glossed over that bit.
How dynamic libraries such as *.so / *.dll are found is kind of platform specific. E.g. on Windows, DLLs are automatically searched for in the application folder and a bunch of other folders (super unsafe, google DLL injection). On Linux you can embed an "RPATH" into your executable that tells Linux where to find the *.so file, or you use the environment variable as shown in the video. Note that this gets even more complicated: Android, iOS and co all use "rpath" but in very different ways.
Very well rounded discussion of header files. Although, you could have added more detail like pragma options and such, I like the way you discussed it since it's easier for beginners to understand without all of the details they can learn later.
This video was just perfect
I like the quote "Everything is open source if you can read assembly" ❤
I just started coding, into this video around 2 min, already subscribed...❤❤
I *really* wish this video (or something equivalent) had existed when I first started learning C in the 80s, because this stuff confused the hell out of me at the time, & was never explained properly in the texts I was learning from.
I have been looking for a good video about this for a long time 😂 thank you!!
I love his chill vibes
I am currently porting some stuff to Rust that uses a header file that you are supposed to include multiple times, changing other state in between so that you can declare multiple versions of a struct. This is pure insanity and should never have been possible.
9:45 You can compile it more easily in one go. Instead of creating separate objects, give gcc all the .c (and .o) files.
thanks that was super informative!
3:11 Can you have another look at this? I think this was supposed to say "link -> dll" for Windows? As far as I know static libraries (*.a / *.lib) are not linked at all. E.g. on Linux the command "ar" will just bundle several *.o files into a single *.a. No linker involved. It would be great to see a deep dive into how static and dynamic libraries work in detail.
"Returns a Client Star" instead of saying a pointer to a clients struct. This is how terror of pointers starts
Wow, I really like the shirt he's wearing. Assembly language just got more interesting.
One thing is that all the code in the header files does get compiled in with the rest of the code which is why really big header files, or source code files with lots of headers take a while to compile since there's a lot of hidden code being analyzed by the compiler.
Which is why header files should not contain actual code. (They can contain "static inline" functions in C.)
I expected to hear more about the Why and less about the How
great definition, this was the first thing I had to figure out decades ago with circular dependencies
Awesome channel, fills in a lot of my knowledge gaps effortlessly!
Awesome explanation! Thanks!
0:10 "£ include"? That's a new one...
I read somewhere that humans eat more bananas than monkeys and I think it's probably true because I can't remember the last time I ate a monkey.
Wow, this is such great information that I found a week after I needed to know it.
Very helpful video, this helped a lot. Thanks, dude.
Please remember 40 years back making it easier for computer to compile was extremely important. I remember old grumpy programmers complaining about inefficiency of high level languages (C)
A video explaining pointers in the context of structs and vectors being passed to function would be great (What is the best way to access the value of a struct/array and assign it to another struct or variable inside of a function for example...), also one on cmake and best practices when developing with C would be awesome as well, just some suggestions! Thank you for the helpful videos!
Great explanation. Thank you
So .h files exist to make it possible to hide the definition of the functions in the .c files. Header files allow the programmer to use the function without seeing the internals of the function. The provider of the code can send .o binaries with the functions, instead of the .c file and send a .h as a "manual" on how to use the black boxed function.
But everything is open source if you read assembly.
I wouldn't say they only exist to hide function definitions. If you're compiling multiple source files into multiple object files, having a place to store commonly shared declarations like functions/enums/etc. makes more sense than having declarations being repeated in every source file.
Exactly the video I needed.
Nice video. Maybe you do some videos about this topic in general in the future, so I have some constructive criticism (even if it was a while ago). If you don't, then maybe these points will shine some light on things for other people:
- you could mention that an #include directive really does what the name tells you: it INCLUDES the file named after the "#include" in the file where the "#include" is. This means that it COPIES the entire file into your file. I personally find this (in my opinion very smart) solution to the problem of declaring functions quite interesting.
- you could SHOW some assembly created without optimization. On the one hand from the .obj files, on the other hand from the linked executables/libraries. When I saw that for the first time, a lot of things became clearer for me.
- you could somehow illustrate the ACTUAL linking of symbols (and what does "symbol" even mean? 🙂 ). Maybe this is a somewhat difficult topic, nonetheless very interesting.
- this last point is also a good cliffhanger to go IN-DEPTH on how the CPU really "executes" functions. How do parameters work, how does the OS and consequently the CPU "know", what to return to where, and what to take from where to what as arguments?
Thank you for the helpful video! At 7:52, I think you misspoke by saying "defined" instead of "declared".
What are the full path of the folders?
When I was doing C I used unity build. Instead of using header files I just included .c files directly. I had my main .c file that had the main loop, and included all the other files into that. It worked beautifully. This method has a downside and an upside. Downside is that you can't tell what each file depends on. The upside is that you have to figure out how to avoid a circular reference, which makes your code more disciplined and organized.
Great video! Nicely presented! 👌
ok but where do i get this shirt?
Great video, learned a lot.
according to Wikipedia of C's history:
*The preprocessor was introduced around 1973 at the urging of Alan Snyder and also in recognition of the usefulness of the file-inclusion mechanisms available in BCPL and PL/I. Its original version provided only included files and simple string replacements: **#include** and **#define** of parameterless macros. Soon after that, it was extended, mostly by Mike Lesk and then by John Reiser, to incorporate macros with arguments and conditional compilation.*
So if you hate the preprocessor, blame Alan Snyder.
Alan Snyder is my hero!
Can you talk about the pimpl idiom? I've seen it discussed as a way to hide implementation details, but I've never understood it.
I'm definitely missing something here, but I'm struggling to understand how the .so file is "closed" enough to be a format that e.g. a manufacturer would use to distribute proprietary instructions that they don't want to be easily readable, but "open" (i.e., human-readable) enough that the linker can locate a specific function by name.
I haven't dealt with shared objects when hacking N64 games so I'm not 100% confident on how they really work, but I imagine that on a machine that the user has full access to everything, there isn't really any sort of built-in security beyond the tiny bit of obfuscation that happens when C is translated to machine code.
However, it's possible to create a machine where only some parts of the machine are accessible to the user, while others are intended to be trade secret to the manufacturer (think secret co-processors on the main CPU type stuff, which I believe is how the more modern Nintendo consoles try to achieve security). One possible implementation would be to design the hardware so that only the secret co-processor has direct access to the hardware resources... so like the shared object could be distributed publicly in an encrypted form that only the co-processor knows how to decrypt, and the co-processor or even just the hardware design could prevent the user from directly accessing the decrypted object file.
The question is why modern compilers cannot generate those header files and we still rely on writing them manually
You can do it with the -aux-info option of gcc. You'd have to do this in the Makefile similar to how you can let the Makefile invoke gcc to generate dependencies.
However, the resulting header files would contain definitions that you don't want to expose to other source files. It is possible to do it, but I don't think any serious C programmers ever do this.
Is this not high level learning? 🤔
Great content - love it!
yeah... and with jokes aside, I start to wonder why low level programming isn't called high level lmao
@@DeveloVooshGWeb Then what would you call high level languages? Lunar languages?
@@EdKolis Idk about you dude, just basing on difficulty
@@DeveloVooshGWeb it's not based on difficulty, it's based on level of abstraction. High level languages are called that because they're at a higher level of abstraction.
@@EdKolis Aware of that, just don't know why it isn't based on difficulty instead. It can be confusing to navigate.
very cool tutorials
pls make a video on kernel headers.
How I can get a shirt like yours? I just love the design
love your channel
#include just copy-pastes, so really a header is just a place to put symbol declarations that are shared across different compilation units. Object files contain stubs inserted by the compiler, and the linker resolves these stubs to some actual memory address at some point, with either static or dynamic linking. Also, I always put extern to make the intent clear for readability, even if it's not needed.
extern int x;
This forward-declares a global symbol x which will have some address that will be inserted later. It's the same for functions. A function is just a memory address; the decoration around it is just so the compiler knows what parameters there are, how to pass things on the stack, and what the return is.
extern int Add(int, int);
the actual definition could be anywhere as long as its part of one of the files that the linker will pull in.
Man, you're awesome, you just answer the most intricate questions devs have. I just had this doubt today and your video came in. Great service. Thx again.
“I trusted you but now your words mean nothing to me because your actions spoke the truth.” -Linker
Hello LLL!
Could you explain what happens in the linking process? What does it do to the object file? How can the executable locate the library code at runtime to execute it? I like to know how this whole world of references and dependencies works in Linux.
Thank you very much for sharing your knowledge, I really appreciate it.
Dynamic linking on Linux is done by ld.so. The object files are in the ELF format, which contains information about the symbols that still need to be resolved.
damn! continuity error at 1:33 (the name of the .o file changed 🥲)
THANK YOU I NEED THIS
This is one of those videos with really clickbaity names, but real content!
which linux distro u use and which ide do u use? thx!
Is it possible to do dynamic linking without having a copy of the .so at _compile_ time, only based off metadata?
Not easily, but sometimes you can get it working by creating a dummy library with the right declarations (but no actual implementation).
"client star" - low level learning, 2023. THATS A POINTER!
I was expecting this to answer why they decided to create these standard headers and not bake what is in them into the language. I assume it's a historical thing, possibly combined with just how low level C is intended to be. They didn't want to have code that is only added if it is used baked into the language itself.
Where can I get this T-Shirt?
That shows how they work, but I never really got why the compiler didn't just handle all of that to begin with. Why is another file necessary when it's just duplicating code that was already written? Wouldn't it make more sense to stay DRY?
You can let gcc generate a file with all declarations (using -aux-info). But in practice the programmer wants to control what declarations the other source files see. The header file is the interface to the source file, and it is a good thing to think about what you want to be in the interface. Declarations that are purely internal to the source file should be kept in the source file.
ok but why not make the .c files compile into libraries that can be read as header files by the LSP/compiler? Like it already needs to have most of the info, some function signatures aren't gonna break it zzz
and you could statically compile libraries or dynamically include them. headers are sometimes cool, but usually it's just more code for same thing.
We need a video on makefiles!
How did you get your vim? nvim? to look like that? Do you have a tutorial for it?
my first guess is that there are some types that are called the same in different libraries?
Can somebody explain to me why multi-module code requires creating a header file and then including that header file in both?
If a source file includes "its own" header file, you can be sure that the compiler will flag any inconsistencies between the function declarations in the source file and the header file. Also, if the header file declares structs and types, you won't have to repeat those declarations in the source file.
how does including the client.h in code.c give access to the definitions in client.c?
Including client.h in code.c gives code.c access to the declarations in client.h.