Bit Hacks from Beginner to Advanced - 11 Amazing Bit Twiddling Techniques

Поделиться
HTML-код
  • Опубликовано: 28 май 2024
  • Support What's a Creel? on Patreon: / whatsacreel
    FaceBook: / whatsacreel
    Official Store: whats-a-creel-3.creator-sprin...
    In this video we explore 11 bit hacks from beginner to advanced beautifully rendered in 3D and to the music of Chopin.
    0:00 - Intro
    1:03 - Set a bit
    1:53 - Clear a bit
    2:45 - Toggle a bit
    3:38 - Convert trailing 0's to 1
    4:33 - Extracting the least significant 1 bit
    5:43 - Masked copy
    7:03 - Swapping bits
    8:30 - Population count
    10:07 - Counting bit islands
    13:07 - Bit scan forwards
    16:44 - Next lexicographic permutation
    Sources for the algorithms:
    Stanford Bit Twiddling Hacks by Sean Eron Anderson
    graphics.stanford.edu/~seande...
    Matters Computational by Arndt, Jörg
    E-Book: www.jjj.de/fxt/fxtbook.pdf
    Hard copy: www.amazon.com/Matters-Comput...
    All music from the International Music Library Project: imslp.org/wiki/Main_Page
    Chopin Waltz in C# Minor, Op. 64 No. 2, Piano: Olga Gurevich
    Chopin Nocturne 2. Andante (E♭ major) , Piano: Aya Higuchi
    Chopin Nocturne Op. 9 No. 1 in Bb Minor, Piano: Harald Vetter
    Chopin Nocturne in F, Op. 15 No. 1, Piano: Luke Faulkner
    Chopin Nocturne in F#, Op. 15 No. 2, Piano: Luke Faulkner
    Background images from animations from HDRI Haven: hdrihaven.com/
    Software used to make this video:
    Visual Studio 2019 Community: www.visualstudio.com/downloads/
    Blender: www.blender.org/
    Audacity: www.audacityteam.org/
    Davinci Resolve 16: www.blackmagicdesign.com/prod...
    OpenOffice: www.openoffice.org/
    Gimp: www.gimp.org/

Комментарии • 331

  • @alexwolski3344
    @alexwolski3344 2 года назад +259

    1:03 - Set a bit
    1:53 - Clear a bit
    2:45 - Toggle a bit
    3:38 - Convert trailing 0's to 1
    4:33 - Extracting the least significant 1 bit
    5:43 - Masked copy
    7:03 - Swapping bits
    8:30 - Population count
    10:07 - Counting bit islands
    13:07 - Bit scan forwards
    16:44 - Next lexicographic permutation

    • @WhatsACreel
      @WhatsACreel  2 года назад +21

      Oh thanks, well done! Pinned :)

    • @diegobellani
      @diegobellani 2 года назад +15

      @@WhatsACreel I think that if you copy and paste this comment in the video's description RUclips will put this chapters directly in the video.

    • @WhatsACreel
      @WhatsACreel  2 года назад +13

      @@diegobellani You're joking? That's great, let's give it a try :)

    • @diegobellani
      @diegobellani 2 года назад +4

      @@WhatsACreel It should work like this: ruclips.net/video/b1Fo_M_tj6w/видео.html

    • @alexwolski3344
      @alexwolski3344 2 года назад +12

      @@WhatsACreel For this to work, it looks like the first timestamp must be 0:00. So just add "0:00 - intro" at the beginning

  • @abonham82
    @abonham82 2 года назад +411

    For the amateurs amongst us, it would be great to see a practical application for each. Awesomely understandable though!

    • @WhatsACreel
      @WhatsACreel  2 года назад +118

      Oh, that would have been great yep! Cheers for watching mate :)

    • @Edzward
      @Edzward 2 года назад +51

      In game-making we use it a lot. Specially in tile based games. It can be used to decided which sprite to show, the state of the tile (passable/non-passable/partially-passable from each direction) if the height of the tile (bellow player/same of the player/above the player), animation state.

    • @RupertBruce
      @RupertBruce 2 года назад +41

      Encryption, compression, graphics, binary serialization all use these heavily.

    • @mytech6779
      @mytech6779 2 года назад +6

      I was going to ask similar. I can see the use for simple tasks in tiny embedded micro-controllers but it seems like it could quickly get unmanageable and complex if used extensively in a big program on a 64bit workstation. Especially if also utilizing all the extended features of modern CPUs like AVX, MMX, and SSE. (This is from the perspective of an amateur attempting to better optimize a program that could run a computation for a solid week on an 8core CPU.)

    • @tankerwife2001
      @tankerwife2001 2 года назад +31

      Among us

  • @anatheistsopinion9974
    @anatheistsopinion9974 2 года назад +100

    8:16 Warning: Supposing all code here is written using the C operator syntax, the & operator has precedence over the ^ operator, so there needs to be parentheses around the first two terms of the expression for P for the operation to behave as expected. The clang compiler actually gives you a warning when you mix different bitwise operators without specifying precedence explicitly with parentheses.

    • @diegorodriguezv
      @diegorodriguezv 2 года назад

      Same problem in python.

    • @klausschmidt982
      @klausschmidt982 2 года назад +2

      Thank you. I don’t know why I immediately dismissed operator precedence when I tried to figure out why it wasn’t working.

    • @proloycodes
      @proloycodes 2 года назад

      another man of non-culture i see

  • @TerjeMathisen
    @TerjeMathisen 2 года назад +40

    Funnily enough, before we got popcnt and similar opcodes, the Kernigan loop wasn't the fastest way to count the number of set bits, particularly if you had a very long array of words where you needed to count the bits: It is faster to use another bit hack which performs bitslice operations to compress (using full adders) 7 or 15 words into 3 or 4, then count the bits in each of these!
    An intermediate idea uses in-register table lookups to count the number of bits in each nybble, doing so in parallel for up to 32 nybbles or 16 bytes with AVX operations, or (foregoing SIMD) using a 45-bit constant to convert a nybble index into a 0-4 count. Both of these methods are faster than the (x & -x) when many bits are set.

    • @ulysses_grant
      @ulysses_grant 2 года назад

      Holy moly, I think you sir had lived in the golden era of computation to build such knowledge! Amazing comment!

    • @AussieLuke0123
      @AussieLuke0123 2 года назад +2

      Ron Gutman's Algorithm Alley in September 2000 Dr Dobb's Journal had a good explanation of this popcount approach. Still available to view online, but RUclips wouldn't allow links.

  • @dankillinger
    @dankillinger 2 года назад +62

    Really nice job with the visuals on this video, very concise and clear! (Also love the music)

    • @WhatsACreel
      @WhatsACreel  2 года назад +2

      Cheers mate! Thanks for watching :)

    • @stefanschacht3322
      @stefanschacht3322 2 года назад +4

      Heyy man, also love the visuals and the content is as always extraordinary, but the piano is way too loud and clipping. Just saying... I am hypersensitive in audio... ;)

    • @WhatsACreel
      @WhatsACreel  2 года назад +5

      @@stefanschacht3322 Cheers mate!! I can turn the music down a little in the future, thanks for letting me know :)

  • @ProjectPhysX
    @ProjectPhysX 2 года назад +31

    Really nice tricks!
    13:54 There is a potentially quicker way to do BSF: Count leading zeros using floating-point cast and extracting the exponent. This can also be used to compute log2 to a single digit.
    int count_clz = 31-((as_int((float)x)>>23)-127);
    For BSF you would do:
    x = x & -x;
    int count_bsf = (as_int((float)x)>>23)-127;
    The as_int function is:
    int as_int(const float x) { return *(int*)&x; }

    • @WhatsACreel
      @WhatsACreel  2 года назад +10

      Oh that's gold!! Reminds me of Jim Blinn's floating point trickery! Really nice stuff :)

    • @abonham82
      @abonham82 2 года назад +7

      Feels like it may have been a precursor to the fast inverse square root godhack from Quake 3

    • @abonham82
      @abonham82 2 года назад +3

      Really should have done more reading so I’d know that the history of the Q3 trick includes Jim Blinn

    • @II-ii2um
      @II-ii2um 2 года назад

      @@abonham82 Is that the trick that Creel's referring to or is there another one as well?

    • @ApplepieFTW
      @ApplepieFTW 10 месяцев назад

      @@II-ii2um From the way he wrote the comment, it seems as if he (so, abonham) is (only) referring to the Q3 inverse sqrt. But, just like you, I'm not actually sure about the answer to your question.

  • @azertyuiop7893
    @azertyuiop7893 2 года назад +14

    It's always a pleasure to see your vids about low level things. Never stop posting, please.

    • @WhatsACreel
      @WhatsACreel  2 года назад +4

      Cheers mate, thanks for watching :)

  • @davidchavez657
    @davidchavez657 2 года назад +16

    I found bitwise functions pretty intimidating until I took a digital logic design coarse where we were designing bit shifters and ALUs then it all made perfect sense. Nice hacks!

  • @reillocb
    @reillocb 2 года назад +3

    Words fail to describe how indispensable this channel has become. The music, the accent, the animations, the passion, I haven't felt this rejuvenated by learning since childhood. You made me actually enjoy assembly when the school system had all but ruined it for me.
    You're like PBS for comp sci, and I I'm living for it

  • @Tumbolisu
    @Tumbolisu 2 года назад +81

    Bit shifting a number to the right does not always fill it with 0s on the left! The sign of the number matters. Make sure to use unsigned types when performing these operations.

    • @WhatsACreel
      @WhatsACreel  2 года назад +15

      Well said!

    • @jakistam1000
      @jakistam1000 2 года назад +5

      But will "-x" work then?

    • @transitpointmusic
      @transitpointmusic 2 года назад

      For a signed type, right shift with a fill of ones would still work right? And a lot of instruction sets can do that iirc

    • @transitpointmusic
      @transitpointmusic 2 года назад

      Ah nvm I misunderstood, that would then stop some of the hacks working correctly, lol

    • @bultvidxxxix9973
      @bultvidxxxix9973 11 месяцев назад

      @@jakistam1000 Yes, negating an unsigned integer will convert it to the two’s complement.
      The bitshift problem is actually even worse though. Shifting a negative number to the right is implementation defined (and usually fills it with the MSB). But shifting a negative number to the left is actually undefined behaviour. So the compiler is allowed to assume that number is never negative to do optimizations.
      All of this is for C++, other programming languages may define other behaviour.

  • @sakari_n
    @sakari_n 2 года назад +33

    The masked copy or just a blend as i have seen it usually called is quite useful. It can be used as a basis for branchless SIMD routines. Also for other branchless stuff, but processing multiple items in the same time with one thread requires that the items done take different branches. Also there exist blend instructions to combine the different steps to just single instruction in x64.

    • @aperson4051
      @aperson4051 2 года назад +2

      As good programmers though, don't we want to be thinking at a higher abstraction?
      Also, I've heard that compilers and even CPU front ends can do really amazing optimizations that often result in branchless code anyway so we should focus on being expressive, and let the tools make the code fast for us.

    • @bagok701
      @bagok701 2 года назад +8

      @@aperson4051 How do the compilers learn to do these optimizations; people had to add it in at some level, whether that was abstracted or not.

    • @sakari_n
      @sakari_n 2 года назад +4

      @@aperson4051 The issue is that pretty much all modern SW is slow and laggy and i do not want to make the current situation any worse. It is so bad already. And if you had ever read output of modern compiler you would not make statements like compilers can do really amazing optimizations, because they usually do not. Auto vectorisation works poorly and branch removal works mostly in obvious cases only.

    • @protowalker
      @protowalker 2 года назад +5

      @@sakari_n the majority of software performance problems come mostly from architectural issues rather than optimizations. The compiler can't optimize away a bad algorithm or data structure but you should be able to trust it with low-level stuff like this

    • @sakari_n
      @sakari_n 2 года назад +2

      @@protowalker That is true that the biggest slowdown comes from unnecessary complexity and weird data structures that comes with that complexity. The slowdown from all this extra stuff is the main issue. The slowdown effect from lack of optimization is significantly less than the slowdown effect from all the extra stuff that usually includes all sorts of useless data structures. In my experience of doing random programming for over a decade and couple of years professionally. I have realized that you can solve almost all problem by just doing these two simple steps #1 collect data to process in to an array, #2 iterate over the array and process all data. You do not even need to optimize it. I almost never do. And still just by doing the simplest thing without any weird data structures or algorithms you can process data with such a speed that you would thought to be impossible when comparing to modern software. Even if you are not doing optimization you should still keep in mind the possibilities given you be the computer. Also in my experience compiler do not do amazing job when it come to this type of low-level stuff, but they do OK job at least most of the time. I have seen something that can only be described as catastrophic failures. Still the quality of the optimization becomes irrelevant when the entire program is bloated.

  • @k7iq
    @k7iq 2 года назад +11

    And the background piano music was perfect for this !

    • @WhatsACreel
      @WhatsACreel  2 года назад +4

      Thanks mate, Chopin is fantastic :)

    • @robertbruce7686
      @robertbruce7686 2 года назад

      Man he must look ? wearing a tutu whilst doing video...

    • @WhatsACreel
      @WhatsACreel  2 года назад

      Ha!! Yep, I was definitely in a tutu :)

  • @RightMeow28
    @RightMeow28 Год назад +1

    Love the Chopin soundtrack.

  • @johannmassyn6709
    @johannmassyn6709 2 года назад +6

    Counting bit Islands nice little trick, I'll have to find a place to use it.

    • @WhatsACreel
      @WhatsACreel  2 года назад +2

      It's a great trick for sure! Thanks for watching mate :)

  • @thatcrockpot1530
    @thatcrockpot1530 2 года назад +8

    THIS is the way my teacher should've introduced me to bitwise operation and I might ~had been blown away by straight forward logic as a grown man, smh.
    Thank you very much Creel.

  • @redpepper74
    @redpepper74 2 года назад

    24:10 “Hey presto” is my new favorite phrase

  • @II-ii2um
    @II-ii2um 2 года назад +1

    Got busy & missed your videos for a while mate. But now im back for more.Your assembly playlist was literally the force that kept me going and helped me land a job in the semiconductor industry.
    So thankyou from the bottom of my heart. Your next coffee is on me . Cheers!

  • @Zman2024
    @Zman2024 2 года назад +2

    always great to see another video from you creel, hope you know how much we appreciate these vids!

  • @ricos1497
    @ricos1497 2 года назад +3

    Amazing. I especially loved the shininess of the dice.

  • @phasm42
    @phasm42 2 года назад +43

    One gotcha regarding bit-shifting: the shift amount may be masked to limit it to less than 32 or 64 bits. E.g. (n >>> 64) will return n, not zero, and (n

    • @WhatsACreel
      @WhatsACreel  2 года назад +18

      Oh that's a great point! Yep, we have to be careful to mask our shifts properly or we can end up with some pretty funny code :)

    • @phasm42
      @phasm42 2 года назад +3

      @@WhatsACreel it's caused me a lot of grief in the past, e.g. shifting by 64 bits (from a variable, not a constant of course), expecting it to result in zero, but getting the original value untouched.

    • @perojonsson
      @perojonsson 2 года назад +3

      Well, in C++ shifting an amount equal to or greater that the number of bits in n is undefined behaviour.

    • @phasm42
      @phasm42 2 года назад +3

      It is defined in Java though, specifically that the shift amount is masked.
      I don't think Javascript addresses this, although it behaves the same way.

    • @williamdrum9899
      @williamdrum9899 2 года назад

      I can't think of any CPUs where it wouldn't become 0 (or FFFFF... for ASR) when you exceed the bit size

  • @Bunny99s
    @Bunny99s 2 года назад +24

    I'm disappointed to call the bit island counting advanced and you still did a divide by 2 instead of a right shift by 1 ^^. I thought we talk about bit hacks :)

    • @williamdrum9899
      @williamdrum9899 2 года назад

      I think he did that on purpose for readability. The compiler/programmer will optimize it anyway ;)

  • @adancalderon8915
    @adancalderon8915 2 года назад +1

    Very cool. I really enjoyed the video. I appreciate all the work that must have gone into producing it.

  • @CShep99
    @CShep99 2 года назад +5

    this feels like a video from one my favorite science RUclips channels: "Physics Videos by Eugene Khutoryansky" but instead of physics, its programming stuff! :D

    • @WhatsACreel
      @WhatsACreel  2 года назад +1

      I was certainly inspired by those awesome videos!! :)

  • @xorsirenz
    @xorsirenz 2 года назад

    been a year since ive been on a pc, glad to see you're still posting regularly! its time to get back to learning ASM, thanks for the great content btw!

  • @scorch855
    @scorch855 2 года назад

    Yet another awesome video. Thanks for making this mate!

  • @Intermernet
    @Intermernet 2 года назад +4

    Awesome video mate :-) One of my favourite books is "Hacker's Delight" by Henry S. Warren Jr . I randomly open it to read some crazy bit-hackery, and it's actually proven to be more useful overall than that copy of Knuth I have sitting on my shelf ;-)

  • @billowen3285
    @billowen3285 2 года назад +1

    Woah cool animations, nice music, great explanations!

  • @deeeee487
    @deeeee487 2 года назад

    some of these i never seen before. i like the description of isolating a bit or set trailing 0s to 1 etc.
    some of the advanced ones had my mind blown.

  • @RurikLoderr
    @RurikLoderr Год назад +2

    I've been making a roguelike in c and I've been using the whole bit flipping thing to save space. Each tile has 8 bits that is used to determine whether a tile is transparent, solid, visible, seen, or etc. Works crazy well and it cut the size of each tile in the map by something like 4-5 bytes. I've got it down to only using 140k at start up.

  • @Zooiest
    @Zooiest 2 года назад +1

    This is a beautiful work of art

  • @axnaksjcknjas3395
    @axnaksjcknjas3395 2 года назад

    Love your amazing visuals!

  • @jangofett132
    @jangofett132 2 года назад +1

    great video like always. neat graphics too.

  • @angeldude101
    @angeldude101 11 месяцев назад +1

    The code for Bit Scan Forwards was so beautiful that I immediately pulled up logisim to make an implementation with just logic gates. (x != 0) is really just an n-input OR gate, and those branches adding to the count are really completely independent. Each adds a different power of two, so you can really just output the results of the 8-way ORs directly into the bits of the output. Since each mask is constant and cancels out half the bits, then you can just not connect the masked off wires and instead use (n/2)-input OR gates.

  • @amruthsar
    @amruthsar Год назад

    Wow, I love this video. The illustration, the music, the voiceover, everything is just perfect. I love this video so much because I can watch this video and fall asleep while I am learning something! That is just supercool.

  • @haremari
    @haremari Год назад +2

    Regarding bit hack #6, the masked copy:
    The XOR operator is often woefully underappreciated. Instead of the masked copy using
    A = (B & M) | (A & ~M)
    I often find myself using something like
    A = A ^ ((B ^ A) & M)
    which may lead to shorter assembly and fewer clobbers. If you can design your algorithm to work with an inverted mask ~M in the first place (a "keep-mask" instead of a "copy-mask", so to say),
    A = ((A ^ B) & ~M) ^ B
    may turn out even better.
    Regarding bit hack #11, the next lexicographic permutation:
    If you're on embedded hardware without a bit scan instruction you'd start out by calculating X - 1, then T, then T + 1 as in the video. If you then take X ^ (T + 1), you'll get a set bit wherever there was a change in the transition from X to (T + 1). This will be one single compact "island" of k set bits, comprising the single bit that got set, and below that, the k - 1 bits that got cleared.
    We now need to reintroduce k - 2 set bits at the bottom. So we simply shift the "island" to the right until the lowest bit is set, and then by 2 more places (alternatively: until the carry is set, and then one more place). In pseudo-assembly:
    MOV T, X ; make a working copy
    SUB T, 1 ; use the #4 bit hack ...
    OR T, X ; ...to set all trailing 0 bits
    ADD T, 1 ; clear all trailing 1 bits and set a single 0 bit above them
    XOR X, T ; get the island of bits thus changed
    LP: LSR X ; repeatedly shift it to the right...
    JNC LP ; ...until we see a carry indicating the lowest set bit fell off
    LSR X ; shift right once again, so 2 set bits were discarded in total
    OR X, T ; and rejoin the remaining bits with the result

  • @EnderMega
    @EnderMega 2 года назад +1

    Dam, this was awesome, the content, the explanation and the editing, VERY nice :D

  • @billeroonitheevileggfarmer4845
    @billeroonitheevileggfarmer4845 2 года назад

    your animation has gotten really nice, great bit tricks

  • @mytechnotalent
    @mytechnotalent 2 года назад

    One of the most relevant and important videos of all time. If you are ever going to get into driver development this is a video for you.

  • @ihatethesensors
    @ihatethesensors 2 года назад

    Awesome video!

  • @Axman6
    @Axman6 2 года назад +1

    I came here expecting 11 bit hacks, but was pleasantly surprised when I got 1011!

  • @suncrafterspielt9479
    @suncrafterspielt9479 2 года назад +2

    The animations are superb

  • @ozne_2358
    @ozne_2358 2 года назад

    Truly excellent video!

  • @_Stin_
    @_Stin_ 2 года назад

    That last one was beautiful lol

  • @greob
    @greob 2 года назад

    Nice tricks! Very cool!
    I also love Chopin, to the point where it distracted me more than once while watching. ;)

  • @singularity3724
    @singularity3724 7 месяцев назад

    I was so happy to see the next lexicographic permutation! Recently, I was practising implementing an algorithm to find Hamiltonian paths on graphs, and it seemed that the stated complexity could only be achieved by finding the next lexicographic permutation in constant time. I ended up just sacrificing some complexity because I didn't know how to do it efficiently. Now I do!

  • @sleve_mcdichael
    @sleve_mcdichael 2 года назад

    Very nice animations!

  • @muggish19
    @muggish19 2 года назад

    Amazing video, ty

  • @lx2222x
    @lx2222x 2 года назад +2

    really really nice job, the animations are so wonderful 😍

    • @WhatsACreel
      @WhatsACreel  2 года назад +2

      Cheers mate, now I know it was worth the 100 hours of rendering time, haha :)

    • @alexwolski3344
      @alexwolski3344 2 года назад +2

      @@WhatsACreel Oh nooo you didn't use the cycles renderer did you!? You mad lad. So much dedication.

    • @WhatsACreel
      @WhatsACreel  2 года назад +2

      Nah mate, it was Eevee! I gotta say, modern Eevee is really excellent!! Actually the rendering was pretty quick :)

  • @cffinch44
    @cffinch44 2 года назад +2

    As a perpetual newb I am always eager to learn. I enjoyed this post but would have loved a quick reference to why each hck would be needed (how it might be usedconcretely).

  • @jonathanseagraves8140
    @jonathanseagraves8140 2 года назад

    This is a really great video.

  • @pdx-sw3hj
    @pdx-sw3hj 2 года назад

    Your video is top-notch in terms of everything!

  • @rodrigotobar3606
    @rodrigotobar3606 2 года назад +1

    Great work! Went through half of the video and we were down to 9/11 hacks. I innocently thought there was going to be some extra content after the last two...

  • @fartzy
    @fartzy Год назад

    This is so amazing. Happily subscribe to this channel after this video

  • @ImperiumLibertas
    @ImperiumLibertas 2 года назад +1

    The fact this list of tricks didn't start at 0 is a crime

  • @user-nz3er3lq3r
    @user-nz3er3lq3r 5 месяцев назад

    Amazing video

  • @emotionalDMG459
    @emotionalDMG459 2 года назад

    Love your chanel!

  • @MrJoerT
    @MrJoerT 2 года назад +1

    This is great! I do all this stuff naturally but I find it very difficult to explain to my co-workers. This will help a lot!

    • @MrJoerT
      @MrJoerT 2 года назад +2

      I stand corrected. I do 'some' of this stuff naturally. Great video!

  • @x1expert1x
    @x1expert1x 10 месяцев назад

    this channel is an absolute goldmine

  • @wfzyx
    @wfzyx 2 года назад

    Surprisingly I knew half of the tricks, yet very cool video, I learned a lot. thanks for sharing!

    • @WhatsACreel
      @WhatsACreel  2 года назад +1

      That's great mate! Glad you liked it, thanks for watching :)

  • @diegonayalazo
    @diegonayalazo 2 года назад

    Thanks Creel!

  • @swordofkings128
    @swordofkings128 2 года назад

    Oh hell yes!!! New video!!

  • @_wouter52
    @_wouter52 2 года назад +1

    Good stuff indeed! Next time I need to do some complex things with bits, I'll check out this video again :D

  • @thefekete
    @thefekete 2 года назад

    Thanks for the vid! I feel like we program in wildly different fields, but always find your videos super useful!
    Kind of like when a number theory discovery results in a biology break-through or something :P

  • @Cole.Varial
    @Cole.Varial 2 года назад +2

    These videos remind me of Eugene Khutoryansky's physics videos. complex ideas demonstrated in a way a toddler could understand, which is to say incredibly well. Thank you.

  • @zxcghoul1275
    @zxcghoul1275 2 года назад +1

    you are legend. thanks for learning

  • @volatus2354
    @volatus2354 2 года назад

    Amazing video!

    • @WhatsACreel
      @WhatsACreel  2 года назад

      Glad you liked it, thank you for watching :)

  • @peterjackman1507
    @peterjackman1507 2 года назад

    Awesome animations

  • @jankinsics
    @jankinsics 2 года назад

    Interesting. It’s very useful to obfuscate operations. Thanks so much. 😆

  • @waaaaaaah5135
    @waaaaaaah5135 2 года назад

    This is really cool!

  • @tunafllsh
    @tunafllsh 10 месяцев назад +1

    Didn't know about this neet formula for lexicographical permutation. Might be useful for some bruteforcing stuff.

  • @rafa_br34
    @rafa_br34 2 года назад

    woah, such a masterpiece

  • @1nd3xs
    @1nd3xs Год назад

    wow.. good video thank

  • @eusebius7155
    @eusebius7155 2 года назад

    Amazing intro music. I like your taste

  • @dryoldcrabman6890
    @dryoldcrabman6890 2 года назад

    holt the animations in this are crazy

  • @Nasiulciaa
    @Nasiulciaa 2 года назад +5

    oh wow, I know and probably used all of the first 10 ones (writing board game engines for fun does that to people), but the last one is something I would've definitely used if I knew it exists because I actually needed it not too long ago u.u
    I needed all possible permutations of a 32 bit variable that has 3 bits set and I kinda got an idea how to do it using loops, but that made my soul cry with how awfully huge the time complexity is I just scraped that idea
    while I technically could've continued with the loops (just three nested loops, first one from 0 to 29, second one from i+1 to 30 and third one from j+1 to 31), after some time I realised that I would like to have code that works not only for three, but also for any given "n" of bits, and changing the number of loops every time I decide I want a different "n" would be just terrible code
    this thingie would actually make it work so easily, just take zero, set the lowest n bits to 1 and spam the next lexicographic permutation thingie until I've checked all of them

  • @surman1816
    @surman1816 2 года назад

    lovely !

  • @coefficient1359
    @coefficient1359 2 года назад +2

    Thank you!

  • @hbibamrani9890
    @hbibamrani9890 2 года назад +1

    This man is a legend!

  • @realcygnus
    @realcygnus 2 года назад

    Good Stuff !

  • @Rudxain
    @Rudxain Год назад

    The BSF (AKA CTZ and FFS) binary search algorithm is only efficient in fixed-precision! For anyone dealing with arbitrary precision, you must use a different algorithm:
    1. Do a trivial linear search from the least significant word in the BigInt, until you find a word whose bits are not all-zeros. This is faster than iterating over individual bits, because CPUs can handle 1 word per clock tick.
    2. For every word that you iterate, add the wordsize to the counter. For QWords, add 64, DWords are 32, etc.
    3. When you find a non-zero word, switch from linear to binary search. Forget all words and focus only on the current word. Now you can use the same algorithm shown in the video.
    4. Return the count

  • @thenoblegnuwildebeest3625
    @thenoblegnuwildebeest3625 2 года назад

    Love the Chopin.

  • @billy65bob
    @billy65bob 2 года назад +1

    You sort of covered one of my favourites in "population count": x & (x - 1)
    It's incredibly useful for testing if x is a power of two; I've never had to do a popcnt with it though.

  • @muskyoxes
    @muskyoxes 2 года назад +6

    The main missing piece that comes up for me is nlz - number of leading zeroes, your bit scan forwards from the other side. Getting the "order of magnitude" of a number comes up again and again and i hate how nasty it is to do what should be a standard instruction as easily accessible as xor.

    • @WhatsACreel
      @WhatsACreel  2 года назад +4

      True, true!! We can get it from the exponent of the float if we do a conversion. Pretty messy. I'm sure there's better bit hacks out there for that, but I can't think of any at the moment. Well, cheers for watching anywho :)

  • @avizazati9961
    @avizazati9961 Год назад

    1 min into the vid... music is really nice !!

  • @williamdrum9899
    @williamdrum9899 2 года назад

    Thanks for making this so clear, I was able to code all of these in Z80 Assembly in 20 minutes after watching the video. I think you forgot a left parenthese in the last one, should be three in a row for the big expression after the bitwise OR

  • @thequeenofspades
    @thequeenofspades Год назад +1

    I'm studying computer science at university. If I ever start to feel like I'm a modern Alan Turing I'll watch a video like this to -remind myself I know fuck all- stay humble.

  • @LL-ol8gr
    @LL-ol8gr 2 года назад

    Earned my sub! Subsets of bits is missing!

  • @Dave_thenerd
    @Dave_thenerd Год назад +2

    @Creel
    16:45 How would I go about finding the previous (lower) lexicographic permutation?
    Thanks in advance.

  • @luisca92
    @luisca92 2 года назад

    I am hooked on your videos! I'm a self-taught engineer I guess, I've have made my career literally on mostly convoluted "if"statements in c++, and python, I'm taking a code that would make any engineer probably cringe and immediately ask why I'm doing things that way when it's obvious that the way you do "x" is this other proper way with adequate functions. But for me it's hard enough to even make an attempt at grasping these concepts so till then add another layer of complexity it's just a no-go for the moment, i focus instead on just getting the thing whatever it may be to first work even if barely and or buy a threat and then troubleshooting to my best ability why it's breaking rather than trying to complete the project by coming up with the most proper solution..Seeing these deeper layer of the onion you do such an amazing job at sharing is making a lot of things click in my mind.. Like I always wondered what bitwise actually was, like it's something that when you read about it I just find description of what it and how to use it but not how it does what it does, these videos feel like seeing behind the curtain or looking under the hood of the vehicle that drives this current civilization

    • @williamdrum9899
      @williamdrum9899 2 года назад

      Here's a few fun "bits" of info.
      When you multiply, divide, or modulus by a power of 2, the computer can do that using bit twiddling which is a lot faster. Case in point, imagine this struct:
      struct foo {
      char x;
      char y;
      char z;
      }
      struct foo bar[32];
      There's a good chance the compiler makes each struct actually 4 bytes (the "extra" byte is padding and is never read or written). This is done for the sole purpose of avoiding having to multiply by 3!

  • @ojonasar
    @ojonasar 10 месяцев назад

    2:01 - buy it flowers, take it out for dinner.

  • @andrewglick6279
    @andrewglick6279 2 года назад +1

    This was absolutely fascinating! Though, after about 2^3-1 minutes, most of the content of this video was shifted into one ear and shift out the other. 🙃

  • @cmac37420
    @cmac37420 2 года назад

    Logic is beautiful.

  • @willofirony
    @willofirony 2 года назад +1

    Thank you for this. I am using an expandable bit map as a record allocation table. Now if I am using visual C++ I can use compiler intrinsic functions for POPCOUNT and BSF and it doesn't get much faster than that. BUT That only works with visual C++ and intel CPUs; the GNU compiler uses intrinsic functions very differently and ARM based computers don't recognise the CPU commands anyway. The expressions you demonstrate are slower but fast enough, more importantly they are completely cross platform. I would never have come up with Kernighan's POPCOUNT expression in a month of Sundays. Thanks again.

    • @MrGeorge1896
      @MrGeorge1896 2 года назад

      gcc and clang (for both Linux and the BSDs) offer builtin-functions which produce the proper assembly instruction for the used architecture. e.g. on x86_64 __builtin_popcountl() gives popcntq and __builtin_ctzl() gives tzcntq. (count trailing zeros) Use -march=native flag as some of there instructions were not part of the original x86_64 instruction set so they are not used for compatibility reasons with the oldest AMD Athlon64 from 2003.

    • @willofirony
      @willofirony 2 года назад

      @@MrGeorge1896 thank you for that. Yes, I am cognizant of the x86_64 __builtin functions but that require keeping two sets of source files . Nevertheless, it was kind of you to pass your wisdom on.

    • @MrGeorge1896
      @MrGeorge1896 2 года назад +1

      @@willofirony just to make it a little clearer these builtin functions are NOT x86 specific (32 or 64 bit) they are generic and work for ALL architectures, x86, ARM, POWER whatever... If there is a specific instruction for your platform gcc and clang will use it otherwise they will use a subroutine call.

    • @willofirony
      @willofirony 2 года назад +2

      @@MrGeorge1896 Now, thank you for that. The plurality of targets had escaped me completely. You made an old man very happy,

  • @Hooghog
    @Hooghog 2 года назад

    Thank you for this fantastic video, so well explained and entertaining!! I'm new to bit hacking and am curious - as an alternative to BSF, you could get the index of the least significant bit by finding the logarithm of the least significant bit: log2(x&-x). Is this an invalid bit hack because it uses log?

    • @sodiboo
      @sodiboo 2 года назад

      From ProjectPhysX's comment, you can convert to a float and then bitcast back to an int (aka convert the same value in float representation, interpret as int) and extract the exponent there to find the logarithm, which is a bit hack to find the log, so you can kinda do it with bit hacks
      but then again, BSF is kindof an implementation of log2 that works when the exponent is an integer, so that way it wouldn't be a bithack (you'd need some implementation of log2 with bithacks as well)

  • @csbluechip
    @csbluechip 2 года назад +6

    a) Why not use >>1 instead of /2 ?
    B) "Why?" This really needs a second video giving an example use for each operation

    • @proloycodes
      @proloycodes 2 года назад +3

      because /2 slooow

    • @terrorist_nousagi8747
      @terrorist_nousagi8747 2 года назад

      Dividing is really slower than bit shifting. In older games like Quake they tried to evade using division whenever possible. There is a video that shows the use of this in a algorithm to calculate the inverse square root.

    • @csbluechip
      @csbluechip 2 года назад +1

      @@terrorist_nousagi8747 ruclips.net/video/uCv5VRf8op0/видео.html ??

    • @williamdrum9899
      @williamdrum9899 2 года назад +3

      Readability was the most likely reason, we all know a bit shift is faster 😉

  • @EvilStreaks
    @EvilStreaks 2 года назад +1

    I didn't know any of these. Interesting methods. I'll implement some of them. But after the first few I can't think of a reason for doing them - especially the last one. Then again I'm used to BASIC, so..

  • @antoniocs8873
    @antoniocs8873 2 года назад

    Very nice, lots of information. I think the blender animations were a bit much a simple pen and paper approach would have been sufficient but thanks again for the info

    • @bornach
      @bornach 2 года назад

      Well 3blue1brown could also do everything using pen and paper but would RUclips have started recommending his video?

    • @antoniocs8873
      @antoniocs8873 2 года назад

      @@bornach I'm just giving my opinion... chilax I've seen some of his other videos where he just uses Visual studio and I think (again in my opinion) that all this blender stuff, while pretty, doesn't really give that much in terms of explanations and it probably took him a while to do all the animations (which again are quite nice)

  • @stevojohn
    @stevojohn 2 года назад +1

    Really interesting video, but for many of these I can't think of a real-world use case.

  • @den2k885
    @den2k885 2 года назад +1

    10 years programming in C and Assembler and I never though of #3.

  • @cervi6538
    @cervi6538 2 года назад

    6:39 you can use A = (B & M) | A instead which saves some processing

    • @williamdrum9899
      @williamdrum9899 2 года назад

      Wouldn't the compiler do that for you anyway?

    • @cervi6538
      @cervi6538 2 года назад

      @@williamdrum9899 idk but it's shorter and less redundant imo

    • @billy65bob
      @billy65bob 2 года назад +1

      that's a masked or, not a masked copy; it won't copy the 0's from B.
      Just to keep it simple: A = 101 B = 010 M = 011
      With your proposed statement, the result will be 111; but we wanted 110