I've been coding for 6 years and I've used bitfields before (as an API consumer, not as an implementer) but didn't know how they worked. Now I feel I understand. Thanks for providing this clear and concise intro to them!
Glad I could help!
Great videos. I'd like to add a little something from my own experience with struct bit-fields.
Be careful with struct bit-fields: they look like a good idea when dealing with hardware registers or exchange protocols, which are most of the time (all of the time?) in the form of a CPU word in which each bit has a special meaning.
But compiler padding, endianness, and register access can make things very non-intuitive. I hit this wall the hard way, losing days figuring out why my serial link was not working ^^.
The best way to understand how a processor handles bit-fields is to check its ABI (one of the least known and most important documents about a processor, in my opinion).
Additionally: the packed attribute tends to remove the compiler's memory alignment assumptions, which can turn a simple word access into several one-byte accesses. Saving 2 bytes of memory could therefore cost you some CPU cycles; that might or might not be a problem, and the trade-off is up to you.
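To illustrate that trade-off, here is a rough sketch, assuming GCC/Clang's non-standard __attribute__((packed)); the field names are made up, not taken from any real register map:

#include <stdint.h>
#include <stdio.h>

/* Naturally aligned: the compiler may insert padding after 'flags'
   so that 'value' starts on a 4-byte boundary. */
struct aligned_reg {
    uint8_t  flags;
    uint32_t value;
};

/* Packed: no padding, but 'value' may be misaligned, so on some
   targets a single word load becomes several byte loads. */
struct packed_reg {
    uint8_t  flags;
    uint32_t value;
} __attribute__((packed));

int main(void)
{
    printf("aligned: %zu bytes, packed: %zu bytes\n",
           sizeof(struct aligned_reg), sizeof(struct packed_reg));
    /* Typically prints "aligned: 8 bytes, packed: 5 bytes". */
    return 0;
}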
Just use *pahole* to make sure the packing of your structs is adequate and optimal; this may also reduce cache misses and memory misalignment. Compilers are not very smart at optimizing structure packing and padding, so using the aforementioned static analysis tool and rearranging the members properly can potentially save hundreds of MB or even GBs of memory, and thus CPU cycles, in the long run (there's a rough sketch of the idea after this comment).
As for endianness: unless you're working with very proprietary or esoteric hardware (or toolchains, even), byte endianness shouldn't be an issue anymore; most (if not all) low-level server-side modules have already addressed it by standardizing and implementing abstractions to detect and convert between the two modes. But if the network protocol is being made from scratch instead of using an already proven one (which you should only do for educational purposes anyway), then running into endianness hell is almost 100% guaranteed, especially if the devices on the network aren't aware of each other and their endianness-coping mechanisms differ considerably.
In simpler words:
It would be like expecting an English speaker to understand a Mandarin speaker just because they can send a stream of vibrations to each other, without either having prior knowledge of the other's lexical structure. They might get some 'sounds' right that the other understands, but most will not be grammatically or syntactically correct and will not be understood correctly, or at all. That is what the wrong byte endianness looks like to a processor.
Greetings, and very good advice for those beginning to lurk in the depths of networking protocols and low-level implementation.
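For instance, here's roughly the kind of padding pahole would flag, and the fix by reordering (hypothetical members; exact sizes depend on the ABI):

#include <stdint.h>
#include <stdio.h>

/* Careless ordering: padding after 'a' and after 'c'. */
struct wasteful {
    uint8_t  a;   /* 1 byte + 3 bytes of padding   */
    uint32_t b;   /* 4 bytes                       */
    uint8_t  c;   /* 1 byte + 3 bytes of padding   */
};                /* typically 12 bytes total      */

/* Same members, widest first: padding only at the tail. */
struct compact {
    uint32_t b;   /* 4 bytes                       */
    uint8_t  a;   /* 1 byte                        */
    uint8_t  c;   /* 1 byte + 2 bytes tail padding */
};                /* typically 8 bytes total       */

int main(void)
{
    printf("wasteful: %zu, compact: %zu\n",
           sizeof(struct wasteful), sizeof(struct compact));
    return 0;
}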
Thank you for putting these videos out here on youtube. I have learned so much from your videos.
You manage to put so much information in such a short video, and still have it be very clear and concise.
I work in embedded and I regularly use bit fields for structs, however SET_BIT & CLR_BIT macros were completely new to me. Thanks man!
If you do use these macros, make sure to wrap the variables inside the macro definition with parentheses: e.g. `#define SET_BIT(BF, N) ((BF) |= (0x0001ULL << (N)))`
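For example, a fully parenthesized set might look like this (a sketch; the names follow the ones mentioned in the video and the comments, not necessarily the exact originals):

#include <stdint.h>
#include <stdio.h>

/* Parenthesizing BF and N guards against precedence surprises when the
   arguments are expressions like "flags + offset" or "n & 7". */
#define SET_BIT(BF, N)    ((BF) |=  (0x0001ULL << (N)))
#define CLR_BIT(BF, N)    ((BF) &= ~(0x0001ULL << (N)))
#define IS_SET_BIT(BF, N) (((BF) >> (N)) & 1ULL)

int main(void)
{
    uint64_t flags = 0;
    SET_BIT(flags, 3);
    printf("bit 3: %d\n", (int)IS_SET_BIT(flags, 3)); /* prints 1 */
    CLR_BIT(flags, 3);
    printf("bit 3: %d\n", (int)IS_SET_BIT(flags, 3)); /* prints 0 */
    return 0;
}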
Wow! You definitely should not. The embedded world is where this can trip you up the most, since the behavior/layout under the hood is not defined.
@@TurboXray It depends on the compiler; for the MCU compiler that I am using, the layout is documented and supported.
Good video. One small optimization at 11:57 would be doing (BF & (1ULL << N)) instead of ((BF >> N) & 1), since the mask can be computed at compile time when N is a constant.
what?
@@charankoppineni4498 For (BF >> N) & 1, BF >> N cannot be known at compile time in this use case, since BF might be anything, so the shift has to be done at runtime.
Doing BF & (1ULL << N) instead lets the compiler compute the (1ULL << N) mask at compile time when N is a constant, leaving only the AND for runtime.
@@inferno3853 But the value of N keeps changing as the loop iterates, right? Even then, is it known to the compiler?
@@charankoppineni4498 In this case it's up to the compiler to decide what to do. It could inline the entire thing and leave out the loop entirely (which would make it look like a series of checks instead of a loop in the machine code).
But yeah, in general you're right, N could be a variable as well; it's just not as likely as BF being one.
You should note, though, that this optimization matters very little today, since computers are fast; I just wanted to mention it to help people understand small optimizations like this.
what is the ':' operator? Great video as usual
The number of bits reserved for the variable in the field.
@@JoQeZzZ Cool thank you :D
I'd not seen that before either. After reading the replies and doing some digging, I found this if it helps someone else: www.tutorialspoint.com/cprogramming/c_bit_fields.htm
@@shawnmatyasovszky7994 sweeeet ill read up thanks
With structs of bit-fields, one thing to watch for is that the layout of the bits is compiler / CPU architecture dependent: sometimes the first bit is the high bit and sometimes the low bit:

#include <stdint.h>
#include <stdio.h>

union {
    uint32_t value;
    struct {
        uint32_t a:2;
        uint32_t b:2;
        uint32_t c:2;
        uint32_t d:2;
    };
} var;

int main(int argc, char* argv[])
{
    var.value = 0;
    var.a = 3;
    printf("%08x\n", var.value);
}

On my Linux/Intel system, this will print out 00000003. However, on other systems it could print out c0000000 (and since I don't have handy access to one, I wasn't able to double-check; Solaris on Sparc comes to mind as a probable different bit-packer.)
I've sadly experienced this issue when trying to port a code base.
DUDE THIS IS SIIIIIICCCCCKKKKK... I'm going to use this all the time now. This will really clean up my function parameters.
I've been having a hard time understanding how fd_set works while working with the accept() function; I couldn't wrap my head around how they can fit many different file descriptors into, let's say, one single integer. Great video! Thank you so much for your efforts, as your videos have helped me throughout my learning journey.
Brilliant examples. Thank you Jacob for these tutorials.
Love your videos, explaining pretty advanced or obscure techniques in a way that is easy to understand. Keep up the good work :D
Just now implemented my first bit field. Thanks!
Finally, I finally understand bit fields!! Thank you
Thank you! Exactly what I was looking for
Thank you so much! It saved my assignment!
You're welcome. Glad I could help.
Maybe too advanced, but I'd like to see a video about code portability.
For example, bit-fields are a wonderful idea for implementing protocols in embedded, until you encounter an MCU with a different endianness.
What kind of video would that be? A bitfield cannot tell you if it needs endianness conversion. You have to use other means to figure it out. Perhaps a network packet header. Perhaps a cross-compiling toolchain. Not really a bitfield-related problem.
You can use bit shifting and bitwise operators on the number 1 to check endianness; write it out on paper and think about how to do that.
Once you determine that, you can determine how to convert values to the endianness you need, with more bit shifting operations.
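A sketch of what that can look like in practice (note: the detection below inspects the byte layout of the value 1 through a pointer, and the conversion is then done with shifts and masks; this is one common approach, not the only one):

#include <stdint.h>
#include <stdio.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int is_little_endian(void)
{
    uint32_t probe = 1;
    return *(const uint8_t *)&probe == 1; /* does the first byte hold the 1? */
}

/* Byte swap built from shifts and masks, for when conversion is needed
   (same spirit as htonl/ntohl). */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0xFF000000u) >> 24);
}

int main(void)
{
    printf("little-endian host: %d\n", is_little_endian());
    printf("%08x -> %08x\n", 0x11223344u, swap32(0x11223344u));
    return 0;
}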
man. I really love you. Do you have a full C course on youtube or something?
Very useful! With this technique we can save memory in structs, if the padding matches. Thank You!!!
4:28 — that ampersand (&) trick blows my mind.
Usually I have to write something like (((options >> 3) & 0x1) == 0x1) and... wow, there is a much simpler way to do the same thing 😅
Any non-zero value is "truthy", so yeah just checking the option with an & should work.
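For anyone following along, the two equivalent checks look like this (OPTION_C and the bit position are made-up placeholders, not the names from the video):

#include <stdio.h>

#define OPTION_C (1u << 3) /* hypothetical flag living in bit 3 */

int main(void)
{
    unsigned options = OPTION_C;

    /* Both tests are equivalent; any non-zero result counts as true. */
    if (((options >> 3) & 0x1) == 0x1)
        puts("set (the long way)");
    if (options & OPTION_C)
        puts("set (the short way)");
    return 0;
}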
Your videos are so satisfying to watch
Thanks. Glad you like them!
Fantastic video, on a very helpful topic. Thank you.
You are crazy intelligent. Thank you
very informative, keep the good work up
Thanks, will do!
Nice explanation, Sir. Thank you. Please make a video on async or sync in C programming. Also waiting for the data structures video.
Hey Jacob! Thanks for the amazing vid, but as another user, there are some topics i find a lil advanced. Can you do a video talking about macros? Thanks again for your stuff :)
Sure, I could do that. Are you wanting macro basics, or do you have a specific macro-related question?
Fantastic stuff! I’m definitely going to start using these.
One question: Is there an advantage to using preprocessor #define over using enum for giving names to bit fields?
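To make the comparison concrete, the two styles look something like this (made-up names; one practical difference is that enum constants are visible to the debugger and obey scoping, while #define is pure text substitution):

/* Style 1: preprocessor constants. */
#define OPT_READ  (1u << 0)
#define OPT_WRITE (1u << 1)

/* Style 2: an enum; in C these constants have type int. */
enum options {
    OPT_APPEND = 1u << 2,
    OPT_SYNC   = 1u << 3
};

int main(void)
{
    unsigned flags = OPT_READ | OPT_SYNC; /* the two styles mix freely */
    (void)flags;
    return 0;
}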
Hey, thanks for this excellent and very clear video.
I just have one question:
I understand what you're doing in scenario 2, but I don't understand how it relates to a bit field. I see you're creating a struct with ints of reduced size, but where is the bit field?
thank you for this video
Isn't integer overflow UB? I would be careful about saying "it will eventually become negative." I believe __attribute__((packed)) is a non-standard compiler add-on. I would also wrap all the variables in your macros with parentheses just in case you get some operator precedence shenanigans.
amazing stuff! Thanks a lot!
Hello from Russia!
Hey, thanks! Glad you're enjoying the channel.
Can you make a video about bit packing in C?
It's neat. I wonder if this is how APL works under the hood when creating filters. You can imagine it being useful for representing indices in an array, but it's probably too much hassle and maybe ultimately slower.
3:43 you did not include the link in the description :(
ruclips.net/video/iX1uGr6Si0E/видео.html here is the video
nice explanation
Great video explaining this concept. Thanks @Jacob for your video series.
Comment on the example 3 printing: the pattern it prints in your for loop is the reverse of how the actual bits are conceptually stored in memory. What if we do:
for(int i=64; i>0; i--){
//check if bit is set and print '+' or '.'
}
this should print the bits in MSB to LSB order, matching the usual binary notation. Just thought I'd ask.
It should be for(i=63; i>=0; i--), because the IS_SET_BIT macro is defined that way (counting from 0). Byte order doesn't really apply here, since we're addressing individual bits, not bytes.
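Spelled out, the descending loop might look like this (assuming an IS_SET_BIT macro that counts from bit 0, as in the video; the macro body here is a guess at the same idea):

#include <stdint.h>
#include <stdio.h>

#define IS_SET_BIT(BF, N) (((BF) >> (N)) & 1ULL)

int main(void)
{
    uint64_t bf = 0;
    bf |= 1ULL << 0;   /* set the lowest bit  */
    bf |= 1ULL << 63;  /* set the highest bit */

    /* Print bit 63 first and bit 0 last, so the output reads like
       ordinary binary notation (most significant bit on the left). */
    for (int i = 63; i >= 0; i--)
        putchar(IS_SET_BIT(bf, i) ? '+' : '.');
    putchar('\n');
    return 0;
}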
Great video, but I was thinking the whole time: when should I use flags and when bit fields? What are the performance and portability issues if I use bit fields with 1-bit sizes?
A portability issue is whether the bit-field is filled in from the least-significant bit to the most-significant or vice versa. Consider:

union {
    uint32_t value;
    struct {
        uint32_t a:2;
        uint32_t b:30;
    };
} var;

On Linux/Intel, var.a takes the least significant bits of value. On Sun/Sparc it [probably] takes the most significant bits (I don't have a system handy to test with). I have seen a bit of code that uses such structures when manipulating standards-based formats that have their origins on networks. Usually the structures have conditional compilation depending on whether the LSb or the MSb is used first.
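That conditional compilation often looks roughly like this (a sketch using GCC/Clang's predefined __BYTE_ORDER__ macros; real headers such as the Linux struct iphdr definition use the same pattern):

#include <stdint.h>
#include <stdio.h>

/* A made-up 2-bit + 30-bit word; the member order is flipped depending
   on whether the ABI fills bit-fields from the least or the most
   significant end, which on common toolchains follows byte order. */
union header {
    uint32_t value;
    struct {
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
        uint32_t a : 2;
        uint32_t b : 30;
#else
        uint32_t b : 30;
        uint32_t a : 2;
#endif
    };
};

int main(void)
{
    union header h = { .value = 0 };
    h.a = 3;
    printf("%08x\n", h.value); /* 00000003 either way, if the #if is right */
    return 0;
}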
Technically speaking (or from a language-feature perspective), scenarios 1 & 3 are not bit fields. They're just normal bit fiddling on a given bit pattern of type int.
Isn't it undefined behavior to overflow a signed int? I don't know if that also applies to bitfield ints but I would assume so...
Yes, it is. A pretty good discussion can be found here. I'm also assuming it applies to bit-fields. www.gnu.org/software/autoconf/manual/autoconf-2.63/html_node/Integer-Overflow-Basics.html
@@JacobSorber That's... less decisive than I had hoped. I also tried searching this up myself and it seems like no one acknowledges the existence of signed bitfields, much less overflow behavior. Of course it works, it is hard to imagine how you could implement a compiler with any other behavior, but nowadays it is important to make sure it's also in the standard lest your compiler starts deleting code like a madman.
I personally don't think there's a need for an option to have multiple set bits. It might end up interfering with other options when it is bitwise-OR'd with them.
great video!
Would a bit field of 100K to a million “bools” be faster than a C array of the same number of bools?
Awesome!!! Wish I had your videos ages ago when I was first trying to learn this stuff.
I wonder if you can switch on all the known combinations if your bitfields are few
thanks for the video))
Aren't the 1st and 3rd examples usually referred to as masks? I think only the 2nd one is about bit fields.
Nonetheless, great video as always.
Yes, the first time I saw this kind of thing was in an assembly book, in the chapter on bit masks, and later in my OS class.
How about a video on reading millions of integers from a file into memory and doing some processing on them? For example, each integer representing one bit of a big giant array... it's a commonly asked interview question and I get stuck every time 😥
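For what it's worth, the usual answer is exactly the bit-field idea scaled up: one bit per possible integer, stored in an array of words. A rough sketch (the value range and the sample data are assumptions; real code would read the integers from the file):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_VALUE 10000000u                 /* assumed range: 0..10 million */
#define WORDS     ((MAX_VALUE / 64) + 1)

int main(void)
{
    uint64_t *bits = calloc(WORDS, sizeof *bits); /* one bit per value */
    if (!bits)
        return 1;

    /* Pretend these integers came from the file: mark each one as seen. */
    unsigned sample[] = { 42, 1000000, 42 };
    for (size_t i = 0; i < sizeof sample / sizeof sample[0]; i++)
        bits[sample[i] / 64] |= 1ULL << (sample[i] % 64);

    /* Membership test: one index, one shift, one mask. */
    unsigned q = 42;
    printf("%u seen: %d\n", q, (int)((bits[q / 64] >> (q % 64)) & 1));

    free(bits);
    return 0;
}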
What open-source IDE do you use/recommend for C/C++?
I use Mousepad + the good old terminal.
No IDE needed :P
Vim
It is amazing
Great tutorial, but I wish you had explained the &, | and ~ operators!
It was when I conceptually understood the program counter and its relationship with the program status word and its relationship with the actual binary instruction that I felt I really understood exactly how the computer did what it did.
I'm concerned we've become far too abstracted from the hardware and understanding how it does its work.
Hi, I just started learning Java and how to use its APIs, but I have no idea how the processor and binary instructions work.
Can you point me to how to learn all these concepts?
@@newjade6075 It was in a systems analysis class, after I started out by learning IBM 360/370 Macro Assembly Language and then COBOL in 1980, that I got a clear idea of the binary instruction hitting a recognizer of some sort and specific bits of the instruction (the op code) actually triggering flags in the “Program Status Word”: carry, negative, zero, branch, etc., and the computer responding simply according to how the flags were set. When I looked closely at the bit patterns of the op codes, I noticed that the math commands were very similar, with only a few bits different, so they always triggered the same flags except a couple, depending on what the actual operation was. Likewise with the move, compare, branch, and other groups of instructions: a clear similarity of the bit patterns with only minor differences.
I could almost see how the actual bits in a machine language command actually triggered the appropriate flags and thus the response by the computer.
It was like a vision in my head, hard to describe in words, but suddenly it all made sense how a specific binary code would trigger a specific response from a computer. After that, all programming made perfect sense; the only problem I saw was that people always try to abstract themselves away from this basic truth.
I prefer to be right down there on the iron.
@@newjade6075 The book "Computer Organisation and Design - the Hardware/Software Interface" by Patterson and Hennessy is an awesome book for understanding this. To understand how the processor is logically implemented using digital gates, see "Digital Design" by Morris Mano. The second book is hardware-oriented.
Don't EVER use the bit-field notation starting at 7:00 in the video for C/C++. It is processor and compiler dependent and can get you into trouble. How it's laid out under the hood is not defined in the spec and can/will differ between processors and compilers.
Single-bit operations came into play when I programmed "bitboards" for a toy chess engine. I wonder how bit fields might have helped there.
Lovely
Very scary macros at the end; expansion can have very undesired outcomes.
Grateful to the one above, young man.
Considering C was invented to write an operating system... I've never understood why it doesn't have binary literals.
10:55
Might as well add a FLIP_BIT with ^=
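Something like this, presumably (same parenthesized style as the other macros):

/* Toggle bit N: 0 becomes 1, 1 becomes 0. */
#define FLIP_BIT(BF, N) ((BF) ^= (0x0001ULL << (N)))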
C program
Bluetooth socket
And good package manager
just to be clear, he's counting bits starting from 0 in this program.
not beginner friendly
What does your second example have to do with bit fields? You did a poor job of explaining. Thumbs down.
Way too quick to digest info
Sorry it was too fast. Fortunately, YouTube allows you to replay at reduced speed.