Question: the description of the maths library did not have an interpreter entry, whereas the C library .so had one. Both are .so files. So is this interpreter entry in a .so an optional thing?
So let me clarify some terminology. The interpreter is the program that is responsible for loading the program into memory as well as locating any shared libraries the program needs and linking them into the program's memory. The program will only have one interpreter entry. It will only need an interpreter if it is dynamically linked (using shared libraries). If it is statically linked, then it doesn't need the interpreter. In this video, the loader is ld-linux-x86-64.so.2. If you go to 16:24, you can see it listed in the output of the file command. Now, if you go to 18:53, you'll see part of the output of the readelf command that shows the shared libraries that the binary needs. You see both libc and libm listed. They both have .so.6 extensions. These are the libraries that the loader will look for and link into the program's memory. At 19:05, the interpreter (loader) is shown again. I hope this helps. Let me know if it still isn't clear.
Thanks for the interesting video. I have one question. How does the dynamic linker know whether a shared library used by the process being started is already loaded somewhere in memory (because another running process is using it) or is not yet in memory? Does the dynamic linker have some internal structure that it uses to record which shared libraries were already loaded, to avoid attempting to load them again? How does this part work?
Yes, the dynamic linker/loader keeps track of which shared libraries are already loaded into memory. On Linux, this is normally some version of the ELF interpreter ld-linux.so. It is a complex program that is actually a shared library itself. It gets mapped into a process' address space when the process is started. It is responsible for loading the shared library into memory if it is not already loaded, mapping the library into the process' address space, and resolving library symbols for the process. The dynamic linker/loader keeps track of how many processes are using a shared library by using a counter. The counter is incremented or decremented to keep track of how many processes are using the library. If the counter reaches 0, then the shared library can be unloaded from memory. The counter is likely part of an internal data structure, but I am not sure what that data structure looks like. I hope this answers your question.
@@embeddedarmdev Yes, you answered my question and provided even more detail than I expected, thanks for that! I dug a little into this and found the l_direct_opencount member inside struct link_map (file link.h from glibc), but it seems to me that it is only used when the program calls dlopen directly. I ran a simple test, and only when I called a function from the shared library using dlopen and dlsym did I get output like "opening file=libdynamic.so [0]; direct_opencount=1" after running LD_DEBUG=all ./a.out 2>&1 | grep count. When the program was compiled without the dl functions, but with the -ldynamic flag instead, this output was not visible. Furthermore, when I ran the test program many times in the background, I got direct_opencount value = 1 for each execution, and when the program finished it was reported as 0, so it looks like this value is kept per process. I will definitely have to investigate this further. When do you plan to release the video on undefined symbol resolution (GOT, PLT, etc.)?
That is an interesting experiment. You'll have to let me know if you find out anything more. I recall looking at the code for the dynamic linker some time ago and I am pretty sure I saw that it has its own modified version of dlopen (dl-open I think), so there would be a few differences in how it works compared with the standard dlopen. I am not 100% sure, but I suspect that each process gets its own link_map and l_direct_opencount, so that would explain why each process was only reporting direct_opencount value = 1. I do have plans to do a video on GOT/PLT in the near future. I'm going to start a new series on ELF files and from there go into some basic Linux binary analysis, which is when I would talk about the GOT/PLT in detail. I expect that will be sometime in the next few months.
gcc automatically does it for you. Most programs will require a standard library, so it just makes it easier to have it done by default. You can prevent it from linking with the standard libc using the -nostdlib or -nolibc options, but then you would have to provide a library for any functions that are normally provided by libc (like printf).
Let's say you are trying to compile a program for two different Linux distros, Ubuntu 20.04 and Manjaro 20.2. And let's say the program is dynamically linked to a library that has a different name and location on each distro. How exactly do you specify the name and location when you are compiling for each distro?
This will generally be resolved by the loader itself on the target system. As far as the location, the loader will search a set of standard directories to find the library. As long as the library is in one of those directories, it will find it. I don't know why a library would have a different name on two different distros. Can you give me an example of this? Normally shared libraries have a common name. Linux uses linker scripts and soname to make it so we can specify a generic library name at compile time and then the loader will resolve that to a specific version. For example, if you link into the standard library (libc), then when it is compiled, it will tell it to look for a library called libc.so.6. This file is a symbolic link and might point to libc-2.23.so which is the actual shared library. If this shared library were to be upgraded to 2.24, then the libc.so.6 symbolic link would be updated to point to the new version. When the program is loaded, it would request libc.so.6 and get pointed to the new version.
Great video! I have a project using libusb-1.0. But after adding -no-pie to the compile flags, I used your command to check:
$ readelf -a myapp | grep Shared
0x0000000000000001 (NEEDED) Shared library: [libusb-1.0.so.0]
libusb is still dynamically linked, not static. Do you know why? Thanks!
@@embeddedarmdev Thanks so much for replying to my question! Actually after I watched your other video, I added -static option. But it created some undefined reference. Does it mean .so(libusb-1.0.so.0) and .a(libusb-1.0.a) have different contents?
The contents are fundamentally the same, as in they basically contain the same object code. However, they are structured and used differently. The shared object (libusb-1.0.so.0) is designed to be loaded into memory and shared with any program that needs it. The static library (libusb-1.0.a) can't be loaded into memory, it is just a static archive of some object code. When you use the -static option, the linker will look for the static library and not the shared library. To the best of my knowledge, you can't link statically using the shared object (.so). If you want to be able to link statically, you'll need the static archive version of your library. Unfortunately, I was not clear about that in any of my videos. The reason it works in my videos is because I am using standard libraries. For the standard libraries, there is both a shared object and a static archive located in the /lib directory. For example, there is both a libc.a and a libc.so. When you specify the -static option, the linker uses the static archive (libc.a). I am assuming that you don't have the libusb-1.0.a accessible to the linker and this is why you are getting the undefined reference error. If that is not the case, then we'd need to look further to determine why you are getting the error. I hope this answers your question.
@@embeddedarmdev Thanks again! Yes, you are right, .so and .a should be the same. But in order to link the .a, it needs other static libs. My current build looks like this:
g++ -o libusbtest libusb.o -g -Wall -std=c++11 -lusb-1.0 -lpthread -ludev -static
It generated the error below:
/usr/bin/ld: cannot find -ludev
I updated libudev, but only saw /lib/x86_64-linux-gnu/libudev.so.1, and no libudev.a. I have no problem running this:
g++ -o libusbtest libusb.o -g -Wall -std=c++11 -lusb-1.0
It uses the shared lib.
Yes, it is probably looking for a libudev.a file and not finding it. You can do a mix of static and dynamic using something like this:
g++ -o libusbtest libusb.o -g -Wall -std=c++11 -Wl,-Bstatic -lpthread -Wl,-Bdynamic -lusb-1.0 -ludev
This should link pthread statically (your system should have a libpthread.a) and then libusb and libudev would be linked dynamically. However, I have never actually tried this before. On my system, there is no static library for libudev or libusb. This may be by design. It is possible that the shared library contains a lot of system-level code that is designed to only have one instance running, and running an executable that had the same usb code statically linked could potentially cause conflicts with the already running shared library; but that is just a guess on my part.
Maybe the best video about library linking on Linux I've seen on YouTube
Amazing explanation and demonstration with enough detail to have semi low level look on how linking takes place!
This is by far the best linker video! Kudos to you!
Thanks! I appreciate it!
Straight to the point with enough detail needed to understand ! Great !
Thank you. I'm glad you found it useful.
I have seen many books and blogs about dll. This video illustrates the best.
This video is gold.
Thanks a lot for making this video. Very detailed and precise at the same time. The best video one can create to explain Static and dynamic linking in detail.
Thanks! I'm glad you found it helpful.
Your vids are gold for C newbs like myself. Thank you sir!
Sir your content is amazing. I planned to watch all your playlist tonight
Clean and clear, thank you. This is exactly what I've been looking for
This is the most helpful video I've watched on the subject. It's helped me a lot thanks.
Thanks. I'm glad it was helpful. This seems to be my most popular video so far.
Excellent and well-presented video. Please upload more soon.
This is really great, so clear, and the small diagrams were really helpful. Even my university cannot do a better job of helping students learn than you do. Thanks a lot for your work. Hope you will upload more soon.
Really helpful. Guys, you will get extra knowledge.
That's actually very well explained, thank you!
Super good, deep and understandable
Great Job, please keep it up, practical demo was awesome
extremely useful sir, thanks for sharing your knowledge
Incredibly helpful! Just what I needed, great video!
Thank you. I'm glad you found it helpful.
@@embeddedarmdev I have a question.
If I create static library that has to be linked against some libraries in my system (*.so), when I later want to use an executable that links against the created library, do I also have to explicitly link the executable with said system libraries, or is the information stored inside (i.e. my_lib.a)?
@@miguelg.s.1246 This is an interesting question. Just to be sure I understand what you are asking: you have on your system a shared library (let's call it libtest.so). You create a static library called my_lib.a that links into libtest.so. Now, you want to create a program (let's call it my_app) in which you want to use functions from your static library my_lib.a (which link into the shared library). And you want to know whether, when you compile my_app.c, you would have to link into the shared object.
If that is your question, the answer is yes, you would have to explicitly link with the library libtest.so when you compile my_app.c. The static library (.a) is simply just a collection of compiled objects, there is no linking information in it. When you create the archive, you are not actually linking the archive with the shared object. It is just an archive of objects with unresolved references to other objects in your shared library.
Not only would you have to explicitly link to the shared library, but if you compiled your program dynamically (the default), then the shared library would also have to be present on the target system in order to execute the program.
I hope that answers your question.
@@embeddedarmdev Once again great explanation and thank you, you cleared all my doubts!
very clear and easy to understand video thank you
Thank you!
High quality educational vid, thank you sir.
Great video 👍. Very informative.
So useful and easy to understand!! Great job
Quite a good video! Thanks!
Excellent tutorial
Thank you so much... It was very helpful!
Thank you. I'm glad it was helpful.
Very well done, thank you so much.
Amazing video thank you very much
Great video, the deepest and best explanation I have ever seen. However, I have a question. When you examined the files with objdump, the symbols (functions in this case) appeared to have fixed addresses in memory, didn't they? So every time the program (process) is loaded into memory, are the symbols held at the same memory location? What happens if those addresses are occupied by another process?
Thanks for your comments.
This is a good question; probably worthy of an entire video in response.
Linux makes use of something called virtual memory management. From the process's point of view, it has access to the full process address space allocated to it. Another process would think it has access to the exact same address range. But these addresses do not translate directly to the physical RAM address space. The process memory is logically split into pages of a certain size and the physical RAM is also split into page frames of the same size. The Linux OS will map a process's page to a page frame in physical RAM. So, when the process is accessing a page in its own process memory space, the OS will look up the physical RAM page frame that it maps to and read the data from there.
This process page to RAM page frame mapping is called a page table.
Generally processes have to share the physical RAM. So, when a process needs data that is not currently in RAM, and assuming RAM is full, the OS will need to make space, so it will write data from a RAM page frame into a temporary non-volatile storage. Then it will repurpose that physical page frame for the new process.
Most Linux systems will have a partition called the swap partition. This is where data from RAM is written to and read from in order to help facilitate RAM sharing.
There is a lot to this topic, but I hope this explanation helps you understand it. Please ask again if it is not clear.
Here's a link with some info:
tldp.org/LDP/sag/html/vm-intro.html#:~:text=Linux%20supports%20virtual%20memory%2C%20that,be%20used%20for%20another%20purpose.
This link has a diagram which I think is useful (see Section 3.)
tldp.org/LDP/tlk/mm/memory.html
Nice video. Keep posting!
This is a great video. Good work. Hail Lobster 🦞.
Excellent video. I hope you can make another video explaining in more detail how main finds the real printf function through the PLT, and more about the resolver function _dl_runtime_resolve.
Thank you. This is a topic I intend to do in the near future.
Though I am a student, how can I support your course by a little financial amount. It's soooo good, I don't wanna learn for free!
I appreciate your support. I'm a big advocate of free education and I don't want to limit access to only those that have money, so I chose this platform. I have considered starting up a Patreon, but I don't do enough content to warrant that. Perhaps in the future I will set one up. I do make some money off the ads. If you want to help, the best thing you can do is spread the word about my channel and encourage people to check it out.
10/10 video
With static linking, does the object file call through a pointer to reach the entry point of puts, time, and so on?
No. In the case of static linking, the actual code for these functions is compiled directly into the binary and any calls to these functions are direct jumps to the addresses for that code. The loader is not involved and it doesn't use the global offset table (got) or procedure linkage table (plt).
@@embeddedarmdev Perfect. So, if I'm not wrong, dynamic linking is more challenging to detect and also riskier because it involves a scan of the import table until the right function is found. Am I wrong?
I'm not sure what you mean by challenging to detect. It's not too difficult to examine an ELF file and see which shared libraries it needs and uses.
If we are talking about a dynamically-linked binary, then the function is normally resolved the first time it is called. Each of these dynamically-linked functions has an entry in the PLT and the GOT. A call first goes to the PLT entry, which jumps through the corresponding entry in the GOT. The first time the function is called, the GOT entry points right back to a section in the PLT that invokes the dynamic loader, which is then responsible for locating the dynamic symbol (function). Once the location is resolved, the GOT entry is updated with this address so that the next time the function is called, it goes directly to it.
@@embeddedarmdev I'm seeing things from a Windows perspective, so I guess it has some similarities. I will check your videos again. PS: does the same behaviour occur for a Windows object file, when linked or unlinked I mean? Thanks.
Yes, I was specifically talking about how it works on Linux. I do think Windows works roughly the same, but I can't say exactly because I only specialize in Linux.
Very good video
Is the order of the -l params relevant to the link process? I had a C++ project with CMake, Qt, and other libraries, fully statically linked. But when I changed the order in CMake's target_link_libraries, I got a lot of unresolved symbol error messages from the linker. Only one specific order of the static libraries list works. If you like, I can share the git repo.
Yes, it is relevant. The linker processes its inputs from left to right, so a -l option should come after any object or source files that need it. If the -l option comes before a file that needs it, you can get undefined-reference errors, particularly with static libraries.
gcc file_1.c file_2.c -ltest file_3.c -o test
In this example, if file_1 and file_2 use functions from the library (libtest), then there will be no problems. If file_3 needs something from libtest that was not already pulled in for file_1 or file_2, then you would get a linker error here.
So make sure the -l option comes after all the files that need to link against it.
I resolved the problem using the linker option -Wl,--start-group. With this option the order is irrelevant and it always works.
Cool. Just be aware that using --start-group can be very slow. If you are compiling a large project and use it a lot, it could significantly increase your build time.
Thanks for this video. Very clear explanation of static and dynamic linking. Question: how do I set up the gcc command line to statically link just one library, with the rest of the libraries dynamically linked? For example, I want to statically link libxml2, but dynamically link the C library. This would handle the case where a target machine doesn't have libxml2 but does have the C library.
You can use the -Bstatic and -Bdynamic switches, each followed by the libraries you want it to affect. It will look something like this:
gcc -o binary_name -Wl,-Bstatic -lxml2 -Wl,-Bdynamic
Note that -Wl is a gcc option that passes an option to the linker (ld). So, this is effectively passing -Bstatic and -Bdynamic to ld for linking.
I also don't think you need to specify libc on the command line, it should just figure it out.
man7.org/linux/man-pages/man1/gcc.1.html
man7.org/linux/man-pages/man8/ld.so.8.html
linux.die.net/man/1/ld
Hope this helps.
@@embeddedarmdev Thanks, but I'm seeing a whole bunch of undefined references in the xml2 linking. BTW I'm using -Wl,-Bstatic and -Wl,-Bdynamic On the other hand, If I just use -Bstatic and -Bdynamic everything links ok, but the resulting executable is the same size as when dynamic linking was used for libxml2. This is on Ubuntu 20.04 LTS. So I'll have to look at this in more detail.
@@jamesmerkel1411 If you just pass -Bstatic and -Bdynamic without the -Wl, then they will be interpreted as gcc options rather than being passed to the linker. I suspect that gcc doesn't recognize those options for your platform and ignores them, resulting in just a standard dynamically-linked binary. According to the gcc manual pages, those gcc options are only supported on VxWorks. However, if you use the -Wl option, then it should pass them to the linker no matter what. It is possible there is something wrong with the order of your gcc command. Could you post the exact full command you are using to compile?
@@embeddedarmdev The gcc command is:
gcc myProg.c -o myProg $(xml2-config --cflags) -Wl,-Bstatic -lxml2 -Wl,-Bdynamic -lm
where xml2-config is a script included in the libxml2 distribution and the option --cflags evaluates to:
cflags="-I${includedir}/libxml2 "
One thing that occurred to me is that there is no static library in the Ubuntu installation of libxml2.
Not sure where to check for that.
Your compile command looks fine to me.
You are correct. If you specify static, then the linker will look for a static archive of that library (libxml2.a) and if the linker can't find it, it will throw an error.
You might be able to download the source code and build it in a separate directory to see if it produces the static archive you need.
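One way to check whether a static archive is installed is to ask gcc itself where it would find the file. The sketch below wraps gcc's -print-file-name option in a small helper; the function name has_static_archive is made up for this example, and it only looks in gcc's default search directories (not in any extra -L paths).

```shell
# has_static_archive NAME: succeed if gcc can locate libNAME.a.
# (Hypothetical helper name; only gcc's default search path is checked.)
has_static_archive() {
    path=$(gcc -print-file-name="lib$1.a" 2>/dev/null)
    # When nothing is found, gcc prints the bare file name back,
    # so require a real path that points at an existing file.
    [ "$path" != "lib$1.a" ] && [ -f "$path" ]
}

if has_static_archive xml2; then
    echo "libxml2.a found, so -Wl,-Bstatic -lxml2 has something to link"
else
    echo "no libxml2.a on gcc's search path"
fi
```

If the archive is missing, installing the distro's development package or building the library from source are the usual ways to get it.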
There is (or maybe was) a bug in libmagic that reports a PIE executable as a shared library.
Great video. For me, -static gave me linking errors. What would be the reason?
It would depend on what error it gave you. But I do know that if you want to link statically, you need the static library (.a). You can't build a static binary from a shared library (.so).
Sir,
Does the code for all the functions declared in the header files get copied during static linking, or only the functions that our program actually calls?
No. When you compile statically, only the objects that the target binary needs will get compiled in. For example, if you use the printf() function from libc, then only printf (and any other functions it might need) will get compiled into your static binary. Anything else that is in libc would be left out.
@@embeddedarmdev thank you sir
You are welcome.
Question: the description of the maths library did not have an interpreter entry, whereas the C library .so had one. Both are .so files. So is the interpreter entry in a .so an optional thing?
So let me clarify some terminology. The interpreter is the program that is responsible for loading the program into memory as well as locating any shared libraries the program needs and linking them into the program's memory. The program will only have one interpreter entry. It will only need an interpreter if it is dynamically linked (using shared libraries). If it is statically linked, then it doesn't need the interpreter. In this video, the loader is ld-linux-x86-64.so.2. If you go to 16:24, you can see it listed in the output of the file command.
Now, if you go to 18:53, then you'll see part of the output of the readelf command that shows the shared libraries that the binary needs. You see both libc and libm listed. They both have .so.6 extensions. These are the libraries that the loader will look for and link into the program's memory.
At 19:05, the interpreter (loader) is shown again.
I hope this helps. Let me know if it still isn't clear.
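To look at both pieces of information on your own system, something like the following should work. It assumes binutils is installed and uses /bin/ls purely as an example of a dynamically linked program.

```shell
# /bin/ls is an assumed example; any dynamically linked program works.
bin=/bin/ls

# The single interpreter (loader) entry recorded in the program headers.
readelf -l "$bin" | grep 'program interpreter'

# The shared libraries the loader will be asked to find and map in.
readelf -d "$bin" | grep 'NEEDED'
```

Running the first command on a typical shared library prints nothing, because most .so files carry no interpreter entry. As far as I know, libc.so.6 has one only because it can also be executed directly as a program.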
nice 👍👌
Thanks for the interesting video. I have one question. How does the dynamic linker know whether a shared library used by the process being started is already loaded somewhere in memory (because another running process is using it) or is not yet in memory? Does the dynamic linker have some internal structure that records which shared libraries are already loaded, to avoid trying to load them again? How does this part work?
Yes, the dynamic linker/loader keeps track of which shared libraries are already loaded into memory. On Linux, this is normally some version of the ELF interpreter ld-linux.so. It is a complex program that is actually a shared library itself. It gets mapped into a process' address space when the process is started. It is responsible for loading the shared library into memory if it is not already loaded, mapping the library into the process' address space, and resolving library symbols for the process. The dynamic linker/loader keeps track of how many processes are using a shared library by using a counter. The counter is incremented or decremented to keep track of how many processes are using the library. If the counter reaches 0, then the shared library can be unloaded from memory. The counter is likely part of an internal data structure, but I am not sure what that data structure looks like.
I hope this answers your question.
@@embeddedarmdev Yes, you answered my question and provided even more details than I expected, thanks for that!
I dug into this a little and found the l_direct_opencount member inside struct link_map (file link.h from glibc), but it seems to be used only when the program calls dlopen directly. I ran a simple test: only when I called a function from the shared library using dlopen and dlsym did I get output like "opening file=libdynamic.so [0]; direct_opencount=1" after running LD_DEBUG=all ./a.out 2>&1 | grep count. When the program was compiled without the dl functions, but with the -ldynamic flag instead, this output did not appear. Furthermore, when I ran the test program many times in the background, each execution reported direct_opencount=1, and when the program finished it was reported as 0, so it looks like this counter is kept per process. I will definitely have to investigate this further.
When do you plan to release the video related to undefined symbols resolution (GOT, PLT, etc.)?
That is an interesting experiment. You'll have to let me know if you find out anything more. I recall looking at the code for the dynamic linker some time ago and I am pretty sure I saw that it has its own modified version of dlopen (dl-open I think), so there would be a few differences in how it works compared with the standard dlopen.
I am not 100% sure, but I suspect that each process gets its own link_map and l_direct_opencount, so that would explain why each process was only reporting direct_opencount value = 1.
I do have plans to do a video on GOT/PLT in the near future. I'm going to start a new series on ELF files and from there go into some basic linux binary analysis, which is when I would talk about the GOT/PLT in detail. I expect that will be sometime in the next few months.
why don't we need to link the libc ... like -lc or something?
gcc automatically does it for you. Most programs will require a standard library, so it just makes it easier to have it done by default. You can prevent it from linking with the standard libc using the -nostdlib or -nolibc options, but then you would have to provide a library for any functions that are normally provided by libc (like printf).
@@embeddedarmdev Oh ! Got it, thank you! 😃👍
Let's say you are trying to compile a program for two different Linux distros, Ubuntu 20.04 and Manjaro 20.2. And let's say the program is dynamically linked to a library that has a different name and location on each distro. How exactly do you specify the name and location when you are compiling for each distro?
This will generally be resolved by the loader itself on the target system. As far as the location, the loader will search a set of standard directories to find the library. As long as the library is in one of those directories, it will find it.
I don't know why a library would have a different name on two different distros. Can you give me an example of this? Normally shared libraries have a common name. Linux uses linker scripts and soname to make it so we can specify a generic library name at compile time and then the loader will resolve that to a specific version.
For example, if you link against the standard library (libc), the linker records in your binary that it needs a library called libc.so.6. This file is a symbolic link and might point to libc-2.23.so, which is the actual shared library. If this shared library were upgraded to 2.24, the libc.so.6 symbolic link would be updated to point to the new version. When the program is loaded, it would request libc.so.6 and get pointed to the new version.
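The symlink mechanics can be mimicked in a scratch directory without touching any real library. This is just an illustration: libdemo and its version numbers are invented, and the "libraries" are empty files standing in for real shared objects.

```shell
libdir=$(mktemp -d)

# Stand-ins for two releases of the real shared object (empty files here).
touch "$libdir/libdemo-2.23.so" "$libdir/libdemo-2.24.so"

# The soname is a symlink; this is the name programs request at load time.
ln -s libdemo-2.23.so "$libdir/libdemo.so.6"
readlink "$libdir/libdemo.so.6"    # prints libdemo-2.23.so

# An upgrade repoints the symlink; programs keep asking for libdemo.so.6.
ln -sf libdemo-2.24.so "$libdir/libdemo.so.6"
readlink "$libdir/libdemo.so.6"    # prints libdemo-2.24.so
```

Because the binary only records the soname, nothing has to be relinked when the symlink moves to a compatible newer release.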
I love u
Great video! I have a project using libusb-1.0. But after adding -no-pie to the compiling flag, I used your command to check
$ readelf -a myapp | grep Shared
0x0000000000000001 (NEEDED) Shared library: [libusb-1.0.so.0]
libusb is still dynamically linked, not static. Do you know why? Thanks!
The -no-pie flag just tells gcc not to produce a position-independent executable. It has no effect on whether libraries are linked statically or dynamically. If you want static, include the -static option.
@@embeddedarmdev Thanks so much for replying to my question! Actually, after I watched your other video, I added the -static option. But it produced some undefined references. Does that mean the .so (libusb-1.0.so.0) and the .a (libusb-1.0.a) have different contents?
The contents are fundamentally the same, as in they basically contain the same object code. However, they are structured and used differently. The shared object (libusb-1.0.so.0) is designed to be loaded into memory and shared with any program that needs it. The static library (libusb-1.0.a) can't be loaded into memory, it is just a static archive of some object code.
When you use the -static option, the linker will look for the static library and not the shared library. To the best of my knowledge, you can't link statically using the shared object (.so). If you want to be able to link statically, you'll need the static archive version of your library. Unfortunately, I was not clear about that in any of my videos.
The reason it works in my videos is because I am using standard libraries. For the standard libraries, there is both a shared object and a static archive located in the /lib directory. For example, there is both a libc.a and a libc.so. When you specify the -static option, the linker uses the static archive (libc.a).
I am assuming that you don't have the libusb-1.0.a accessible to the linker and this is why you are getting the undefined reference error. If that is not the case, then we'd need to look further to determine why you are getting the error.
I hope this answers your question.
@@embeddedarmdev Thanks again! Yes, you are right, the .so and the .a should be the same. But in order to link the .a, it needs other static libs. My current build looks like this:
g++ -o libusbtest libusb.o -g -Wall -std=c++11 -lusb-1.0 -lpthread -ludev -static
It generated error below:
/usr/bin/ld: cannot find -ludev
I updated libudev, but I only see /lib/x86_64-linux-gnu/libudev.so.1 and no libudev.a.
I have no problem running this: g++ -o libusbtest libusb.o -g -Wall -std=c++11 -lusb-1.0. It uses the shared lib.
Yes, it is probably looking for a libudev.a file and not finding it. You can do a mix of static and dynamic using something like this:
g++ -o libusbtest libusb.o -g -Wall -std=c++11 -Wl,-Bstatic -lpthread -Wl,-Bdynamic -lusb-1.0 -ludev
This should compile in pthread statically (your system should have a libpthread.a) and then libusb and libudev would be compiled in dynamically. However, I have never actually tried this before.
On my system, there is no static library for libudev or libusb. This may be by design. It is possible that the shared library contains a lot of system level code that is designed to only have one instance running and running an executable that had the same usb code statically compiled could potentially cause conflicts with the already running shared library; but that is just a guess on my part.