Square & Multiply Algorithm - Computerphile

  • Published: 13 Apr 2022
  • How do you compute a massive number raised to the power of another huge number, modulo something else? Dr Mike Pound explains the super-quick square & multiply algorithm.
    Numberphile's Witness Numbers video which inspired Mike: • Witness Numbers (and t...
    / computerphile
    / computer_phile
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscomputer
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Comments • 309

  • @pleasedontwatchthese9593
    @pleasedontwatchthese9593 2 years ago +473

    For people who have worked with assembly programming, this will be very familiar. In the past, CPUs did not have a multiply instruction, and you often had a table of the fastest way to multiply numbers, which, as you guessed, was shifts (which are like the squaring step) and addition.

    • @NoNameAtAll2
      @NoNameAtAll2 2 years ago +30

      Even modern CPUs have shortcuts for squaring and for multiplying by small numbers, so this algorithm is still beneficial.

    • @klaxoncow
      @klaxoncow 2 years ago +26

      Johnny Ball covered this on Numberphile before - search "Russian multiplication".
      Let's call the two numbers A and B.
      With number A, we shift all the bits to the right once.
      The rightmost bit "falls out" of the number and typically gets shifted into a flag on most CPUs - let's say our CPU shifts the bit out of the right into the carry flag.
      So, the carry flag now contains the rightmost (least significant) bit of number A. If it's a one, then we add number B to our running total. If it's a zero, carry on (so you could code that as a "branch if carry not set" over a "add B to total" instruction).
      Then we shift B left one, which doubles it.
      Then we do it again. Shift A to the right. The rightmost bit falls out into the carry flag. If it's a one, then we add B (which we've just doubled, remember) to the running total.
      Shift B left one, doubling it again.
      Shift A to the right. If it's a one, then add B (now four times bigger than it originally was, as we're shifting it left every round) to the running total.
      You keep doing this until you've shifted every original bit of A out of the right side. Stop. You've multiplied the two numbers together and your answer is in the running total register.
      And all we did was shifting bits left and right, and simple addition. And there's only as many "rounds" of this as there are bits in A.
      Another bit of useful binary maths is that the result cannot need more than twice the number of bits of the inputs. That is, if A and B are 8-bit numbers, the result register only needs to be 16 bits - as it's just not possible for two 8-bit numbers to multiply to more than a 16-bit number.
      Indeed, if you're implementing this algorithm, then stick B in a register with twice as many bits and have your "running total" register be twice as many bits. Then you can run the algorithm blindly.
      Which is, as you may have guessed, what the CPU's actually doing in the circuitry with a hardware multiply.
      It's just multiplying by 2 - which, in binary, is nothing more than shifting all the bits to the left once - and addition.
      So, yeah, it's basically the same algorithm as this video, but working at a higher order. So multiplication -> exponentiation. And multiplying by 2 -> squaring. And addition -> multiplication.
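      A minimal C sketch of this shift-and-add loop, assuming 8-bit inputs and a 16-bit running-total register (function and variable names are mine, not from the comment):
      #include <stdint.h>
      /* Multiply two 8-bit numbers using only shifts and adds,
         mirroring the carry-flag loop described above. */
      uint16_t shift_and_add(uint8_t a, uint8_t b) {
          uint16_t total = 0;
          uint16_t wide_b = b;          /* B held in a register twice as wide */
          while (a != 0) {
              if (a & 1)                /* "carry flag": least significant bit of A */
                  total += wide_b;      /* add the current multiple of B */
              a >>= 1;                  /* shift A right, dropping that bit */
              wide_b <<= 1;             /* shift B left, doubling it */
          }
          return total;
      }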

    • @RegrinderAlert
      @RegrinderAlert 2 years ago +3

      @@NoNameAtAll2 Are those tables actually part of the CPU (making use of microcode) or done by a compiler?

    • @NoNameAtAll2
      @NoNameAtAll2 2 years ago +1

      @@RegrinderAlert multipliers are simple enough to be just logic gates
      what I was talking about is "check if top 48 bits are 0, so we don't need to wait/use most of the circuit"
      that, too, is simple enough to be just logic
      about compilers... idk how common the explicit "square" command is in processors

    • @FrankHarwald
      @FrankHarwald 2 years ago +9

      yes, except that's a shift & add algorithm, & shifts aren't like squaring but like multiplying by a power of 2.

  • @GeorgeBratley
    @GeorgeBratley 2 years ago +384

    I think the last bit of the video is fascinating - that you could perform an attack to work out a key based on the CPU time to calculate a square vs a square & multiply. A great example of the theoretical mathematics being ideal vs. the real world implementation being fundamentally vulnerable.

    • @nebuleon
      @nebuleon 2 years ago +54

      Yes! And the technical term for it is a "timing attack".
      Timing attacks can be so insidious that you need to resort to assembly language just to get everything out of the way.
      Dr Pound's example has us doing an "always multiply", multiplying by one (the multiplication identity) after squaring for an unset bit in the exponent, so that we execute the algorithm in constant time. However, using a statement like [if (bit is zero) { multiplicand = one; } else { multiplicand = base; }] to do this "always multiplication" can end up in the *branch predictor's way.* If there are lots of zeroes in your key, it's going to take the "if" path more of the time; conversely, if there are lots of ones in your key, it's going to take the "else" path more of the time. Either way, the branch predictor will execute it faster than if the key were evenly-distributed ones and zeroes.
      To execute *that* in constant time, the CPU has to have a branchless test instruction to set the new multiplicand. Either you validate that the compiler uses the branchless instruction in your C code (say), or you write at least that part of the algorithm in assembly language.
      Edit: Or use the Montgomery form of the numbers, per David Gillies's comment, which makes it easier to have constant-time algorithms
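      A tiny C sketch of the branchless selection described above (hypothetical names; in practice you would still inspect the generated assembly to confirm the compiler emitted a conditional move rather than a branch):
      #include <stdint.h>
      /* Pick the multiplicand without a data-dependent branch:
         mask is all-ones when the exponent bit is 1, all-zeros when it is 0. */
      static uint64_t select_multiplicand(uint64_t bit, uint64_t base) {
          uint64_t mask = 0 - (bit & 1);       /* 0xFFFF...FF if bit == 1, 0 if bit == 0 */
          return (base & mask) | (1 & ~mask);  /* base when bit == 1, 1 when bit == 0 */
      }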

    • @2Cerealbox
      @2Cerealbox 2 years ago +25

      In data centers that have servers that the government uses, the government requires that their servers are plugged into an air-gapped power supply, unconnected to the power that every other server uses, so that a spy couldn't surreptitiously measure changes in their power usage. These are surprisingly effective attacks.

    • @mbican
      @mbican 2 years ago +18

      That's why cryptographic implementations need to run in constant time: no optimization allowed for the multiplication on a zero bit.

    • @domogdeilig
      @domogdeilig 2 years ago +1

      @@nebuleon Multiply by base^bit. Thus if the bit is a 0, it gets multiplied by 1, and for a 1 it's the ordinary multiply. As x^n where n is 0 or 1 is easily calculated, that should take the same time (?).

    • @justinreusnow
      @justinreusnow 2 years ago +2

      Why not simply add a random sleep at the end of the algorithm? The size of the sleep would be a question, but if it’s a random amount each time that is sufficiently large to mask any work being done (or lack thereof), it would remove this timing attack issue. It’s certainly wasteful to simply sleep, but it’s also wasteful to do unnecessary calculations to remain in constant time, no?

  • @thuokagiri5550
    @thuokagiri5550 2 years ago +206

    Dr Pound's breadth and depth of knowledge in computer science never cease to amaze me!!
    "Man from the future"

    • @Ins4n1ty_
      @Ins4n1ty_ 2 years ago +7

      Absolutely, but this specific piece of knowledge is pretty much common knowledge for anyone in CS. I studied this in college about 7 years ago, it was NOT a fun time since we had a pretty bad professor...

    • @quincy2142
      @quincy2142 2 years ago +3

      Not necessarily the breadth, but connecting the theoretical with the practical. Bit on timing attacks was pretty nice.

    • @thuokagiri5550
      @thuokagiri5550 2 years ago +2

      @@quincy2142 he has impeccable pedagogy

    • @jaydeep-p
      @jaydeep-p 1 year ago

      Even if he didn't have the knowledge, I'd still like his teaching style; it's engaging and fun.

  • @davidgillies620
    @davidgillies620 2 years ago +71

    Note that for RSA and similar, the modular multiplication operation itself can be quite expensive, so modern implementations typically convert the numbers involved to an intermediate representation, called a Montgomery form, after Peter Montgomery. The binary exponentiation method can use Montgomery forms throughout, so only at the end is the result converted back to a conventional representation. Montgomery multiplication is also resistant to the side channel attacks mentioned at the end of the video.
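    A rough C sketch of the Montgomery reduction (REDC) step for a toy odd modulus below 2^31 with R = 2^32 (names invented here; real RSA implementations do the same thing on multi-word big integers):
    #include <stdint.h>
    /* -n^{-1} mod 2^32, computed by Newton iteration (n must be odd). */
    static uint32_t neg_inverse(uint32_t n) {
        uint32_t x = 1;
        for (int i = 0; i < 5; i++)
            x *= 2 - n * x;            /* doubles the bits of precision each pass */
        return (uint32_t)(0u - x);
    }
    /* REDC: for t < n * 2^32, returns t * 2^{-32} mod n. */
    static uint32_t redc(uint64_t t, uint32_t n, uint32_t n_prime) {
        uint32_t m = (uint32_t)t * n_prime;        /* computed mod 2^32 */
        uint64_t u = (t + (uint64_t)m * n) >> 32;  /* sum is exactly divisible by 2^32 */
        return (uint32_t)(u >= n ? u - n : u);
    }
    /* Multiply two values already in Montgomery form (a*R mod n, b*R mod n). */
    static uint32_t mont_mul(uint32_t a, uint32_t b, uint32_t n, uint32_t n_prime) {
        return redc((uint64_t)a * b, n, n_prime);
    }
    Converting a number into Montgomery form is one extra REDC against R^2 mod n, and converting back out is a single plain REDC, so the whole square-and-multiply loop can stay in Montgomery form as described above.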

    • @andrewharrison8436
      @andrewharrison8436 2 years ago +14

      Now I have to look up Montgomery forms - or wait for the Computerphile video. I do like internet rabbit holes.

    • @locusf2
      @locusf2 2 years ago +5

      @@andrewharrison8436 also look up Montgomery Ladder which is the similar algorithm for elliptic curves

    • @Czeckie
      @Czeckie 1 year ago

      fascinating, I had no idea this exists

  • @todayonthebench
    @todayonthebench 2 years ago +63

    Interesting algorithm.
    At first I thought it was just going to be a simple "first we build our list of binary equivalents and then just multiply them all together at the end."
    As an example, calculate 3^1, 3^2, 3^4, 3^8, 3^16, etc. And then choose the values our exponent actually contains.
    Then the sleight of hand of the mathematicians came in at 9:40 and made things far, far simpler and much easier to execute in practice.

    • @pikasnoop6552
      @pikasnoop6552 2 years ago +13

      You might have noticed that Mike said that this was the left to right method. Yours (with taking the modulus) is the right to left variant and is just as quick.
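      A small C sketch of that right-to-left variant, taking the modulus at every step (illustrative names; it assumes the modulus fits in 32 bits so intermediate products fit in 64 bits, while real key sizes need a big-integer library):
      #include <stdint.h>
      /* Right-to-left square and multiply: scan the exponent from its least
         significant bit, multiplying in the current power of the base when a bit is set. */
      uint64_t powmod_rtl(uint64_t base, uint64_t exp, uint64_t mod) {
          uint64_t result = 1;
          uint64_t power = base % mod;          /* base^1, base^2, base^4, ... mod m */
          while (exp != 0) {
              if (exp & 1)
                  result = (result * power) % mod;
              power = (power * power) % mod;    /* square for the next bit */
              exp >>= 1;
          }
          return result;
      }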

  • @Alex_Deam
    @Alex_Deam 2 years ago +150

    9:34 It's actually not the minimum number of operations. For example, to make 31 by this method takes 8 operations (SMSMSMSM), whereas the minimum is only 7 operations (N^2, N*(N^2), (N^3)^2, (N^6)^2, (N^12)^2, (N^6)*(N^24), N*(N^30)). However, in general finding the minimum number for a given exponent is NP-complete, so in practice square and multiply is presumably what you'd do. Otherwise, great video!

    • @pikasnoop6552
      @pikasnoop6552 2 years ago +40

      The NP-completeness is a common misconception: this is only proven for sets of numbers, not single numbers. In practice I believe a window method is used, for which one precomputes some values so one can "combine" some multiplications.

    • @Alex_Deam
      @Alex_Deam 2 years ago +15

      @@pikasnoop6552 Thanks for the correction

    • @Wecoc1
      @Wecoc1 2 years ago +16

      Efficient exponentiation is a very interesting topic. The minimum number of multiplications required for N is an open problem in mathematics, you can read more about that on OEIS A003313, "Length of shortest addition chain for n".

    • @Skyb0rg
      @Skyb0rg 2 years ago +12

      That example seems to use more space (you need to remember N^6 until after you finish (N^12)^2). May be the minimum operations in fixed space, where the space is exactly the size of the input string.
      Also important for cryptographic libraries which shouldn’t be allocating memory dynamically.

    • @realKlabauterklaus
      @realKlabauterklaus 2 years ago +3

      If you introduce division as an additional operation, the example of 31 can be reduced to 6 operations: SSSSSD

  • @SRISWA007
    @SRISWA007 2 years ago +21

    This is also known as "Fast Binary Exponentiation", which calculates pow(a, b, mod) in logarithmic time.
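    For comparison, a compact C sketch of the left-to-right version shown in the video, under the assumption that the modulus is small enough for products to fit in 64 bits (the naming is mine):
    #include <stdint.h>
    /* Left-to-right square and multiply: scan the exponent from its most significant
       bit; square every step, and also multiply by the base when the bit is 1. */
    uint64_t powmod_ltr(uint64_t base, uint64_t exp, uint64_t mod) {
        base %= mod;
        if (exp == 0)
            return 1 % mod;
        int bit = 63;
        while (((exp >> bit) & 1) == 0)   /* locate the leading 1 bit */
            bit--;
        uint64_t result = base;           /* the leading 1 bit contributes one factor of the base */
        for (bit--; bit >= 0; bit--) {
            result = (result * result) % mod;        /* square for every bit */
            if ((exp >> bit) & 1)
                result = (result * base) % mod;      /* multiply when the bit is 1 */
        }
        return result;
    }
    For example, powmod_ltr(3, 45, 7) returns 6, matching the 3^45 mod 7 example discussed elsewhere in these comments.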

  • @LeDabe
    @LeDabe 2 years ago +30

    Also called Russian peasant multiplication (in its double-and-add form). It works for any associative product, not only scalar multiplication: raising the 2x2 matrix [1, 1; 1, 0] to a power, for instance, computes large Fibonacci numbers very quickly.

  • @ezg5221
    @ezg5221 2 years ago +22

    I read binary numbers left to right by starting at 1, doubling for each bit, and adding 1 if the bit was a 1. Very cool to see this pattern coming up in exponents

  • @hazemessawi2954
    @hazemessawi2954 2 years ago +8

    I love how entertaining the video is given that I already know what the answer is and have used this quite a lot

  • @longlostwraith5106
    @longlostwraith5106 2 years ago +10

    I always liked calculating that recursively. For example, 2^6 is (2^3)*(2^3), 2^3 is (2^2)*(2^1) and 2^2 is (2^1)*(2^1).
    It's extremely simple to code it too. Here's the algorithm that performs the calculation:
    1) If exponent is zero, return 1
    2) Divide exponent by two, and save both the quotient and the remainder
    3) Call algorithm recursively with (exponent = quotient) and save the result
    4) If remainder is zero, return result*result
    5) If remainder is one, return result*result*base
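    A direct C translation of those five steps (my naming; there is no modular reduction, as in the comment, so it only works while the result still fits in 64 bits):
    #include <stdint.h>
    /* Recursive exponentiation by squaring, following the five steps above. */
    uint64_t power(uint64_t base, uint64_t exponent) {
        if (exponent == 0)                      /* 1) exponent zero -> 1 */
            return 1;
        uint64_t quotient = exponent / 2;       /* 2) split the exponent */
        uint64_t remainder = exponent % 2;
        uint64_t half = power(base, quotient);  /* 3) recurse on the quotient */
        if (remainder == 0)
            return half * half;                 /* 4) even exponent */
        return half * half * base;              /* 5) odd exponent */
    }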

    • @canaDavid1
      @canaDavid1 1 year ago +1

      Unless you cache the results, this is no faster than multiplying one-by-one (probably slower because of recursion overhead)

    • @longlostwraith5106
      @longlostwraith5106 1 year ago

      @@canaDavid1 I don't think you appreciate the difference between O(N) and O(logN).

    • @schwingedeshaehers
      @schwingedeshaehers 1 year ago

      @@longlostwraith5106 you have O(2^log(N)) so O(N)

    • @longlostwraith5106
      @longlostwraith5106 1 year ago

      @@schwingedeshaehers How, exactly? Are you taking the division into account?

    • @schwingedeshaehers
      @schwingedeshaehers 1 year ago

      @@longlostwraith5106 you have log n layers, but these layer get more and more calculations each level. And they get an exponential growth until the log n barrier from the amount of layers

  • @jkye_314
    @jkye_314 2 years ago +20

    I am currently purchasing a master's degree in cybersecurity and this guy summarizes 2h of lectures in literally 17 min ;)

    • @QuantumHistorian
      @QuantumHistorian 2 years ago +3

      Why are you purchasing a degree? Maybe do it somewhere with better teaching then?

    • @ait-gacemnabil9181
      @ait-gacemnabil9181 2 years ago +8

      @@QuantumHistorian he probably meant pursuing

    • @johningham1880
      @johningham1880 2 years ago

      I’m afraid that is the model for university education these days

    • @jkye_314
      @jkye_314 2 years ago

      @@ait-gacemnabil9181 Yeah, you're right. But, in some sense, it actually is a business activity for the university.

    • @saiprasad8078
      @saiprasad8078 2 years ago

      In a way, he is right. Nowadays everything needs to be purchased -- even knowledge.

  • @MrGooglevideoviewer
    @MrGooglevideoviewer 2 years ago

    I love the step-by-step simplistic explanations you give and the focus on the core concepts. Thanks Mike! bloody champion! Peace and Love from Perth Australia!✌❤✌

  • @tsjbb
    @tsjbb 1 year ago +3

    This was fascinating, so simple and intuitive once explained but so powerful

  • @Muzer0
    @Muzer0 2 years ago +1

    Always wondered how the key reading timing/power attacks worked, that makes a lot of sense, cheers!

  • @levyroth
    @levyroth 2 years ago

    This is the coolest maths/CS video I've seen in a long time. Wow!

  • @LuciolaSama
    @LuciolaSama 2 years ago +1

    Dude, you’re such a fun guy to listen to. Keep it up, cheers!

  • @onlyeyeno
    @onlyeyeno 2 years ago +2

    I LOVE this type of content !!!! Thanks a million for making and sharing :)

  • @PopeLando
    @PopeLando 2 years ago

    Fantastic! I watched the same Numberphile video and did the high power mod p calculation on my calculator. And during the process I realised that to get the right number of squares, you turn the power into its binary number and then square the same number of times as the power of two. (And mod every time the answer is bigger than 747). I even checked it by finding the nearest actual primes, which are 743 and 751. Perfect 1s for both!

  • @NotAnAviator
    @NotAnAviator 2 years ago +1

    This video was a lovely reminder of my time spent with number theorists in college, cryptography is so damn fascinating

  • @luminous2585
    @luminous2585 2 years ago

    Thank you for this video. One of the most interesting things I've ever done in school, and I'd almost forgotten about it.

  • @chieeyeoh6204
    @chieeyeoh6204 1 year ago

    This is just mind-blowing! Awesome video!

  • @touficjammoul4482
    @touficjammoul4482 1 year ago

    You Sir saved my life before the exam, I can't thank you enough.

  • @diagorasofmel0s
    @diagorasofmel0s 2 years ago

    what a coincidence, i started studying RSA and y'all put out this banger, thanks Mike and Sean

  • @japedr
    @japedr 2 years ago +10

    This is also called "exponentiation by squaring" and it's super useful in many cases.
    One quick example is in computing the nth Fibonacci number using the 2x2 matrix formula, where one raises a matrix to the nth power. But using this method, the number of multiplications is greatly reduced. There is also a closed form expression using a power of the golden ratio but that requires a lot of numerical precision for large n.
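    A short C sketch of that Fibonacci-by-matrix-power idea (illustrative names; the entries overflow 64 bits past roughly F(93), so larger n would need a modulus or big integers):
    #include <stdint.h>
    /* 2x2 matrix multiply: c = a * b (a temporary copy lets c alias a or b). */
    static void mat_mul(uint64_t a[2][2], uint64_t b[2][2], uint64_t c[2][2]) {
        uint64_t t[2][2];
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < 2; j++)
                t[i][j] = a[i][0] * b[0][j] + a[i][1] * b[1][j];
        c[0][0] = t[0][0]; c[0][1] = t[0][1];
        c[1][0] = t[1][0]; c[1][1] = t[1][1];
    }
    /* nth Fibonacci number via square-and-multiply on [[1,1],[1,0]]. */
    uint64_t fibonacci(uint64_t n) {
        uint64_t result[2][2] = {{1, 0}, {0, 1}};   /* identity matrix */
        uint64_t base[2][2]   = {{1, 1}, {1, 0}};
        while (n != 0) {
            if (n & 1)
                mat_mul(result, base, result);      /* multiply on a 1 bit */
            mat_mul(base, base, base);              /* square every step */
            n >>= 1;
        }
        return result[0][1];    /* [[1,1],[1,0]]^n = [[F(n+1),F(n)],[F(n),F(n-1)]] */
    }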

  • @robertbrummayer4908
    @robertbrummayer4908 2 years ago

    Interesting algorithm and great video as usual

  • @meispi9457
    @meispi9457 2 years ago +7

    I remember using this algorithm for a competitive programming question on one of the codechef's monthly contests, didn't know it had a name.

  • @Richardincancale
    @Richardincancale 2 years ago

    The last minute was spot on - avoiding side attacks!

  • @gustavofring4788
    @gustavofring4788 2 years ago +2

    Truly interesting lesson, just studied this at school!

  • @matthewisrail
    @matthewisrail 2 years ago

    You guys and Numberphile are my 2 favorite channels

  • @4akat
    @4akat 2 years ago

    i love the channel. the slowness of the math demonstrations made me itchy!

  • @johnchessant3012
    @johnchessant3012 1 year ago +1

    binary to decimal: go left to right, start from 0 and double if it's a 0 and double and add one if it's a 1. e.g. for 101010 you do 0 -> 1 -> 2 -> 5 -> 10 -> 21 -> 42. so 101010 = 42.
    decimal to binary: halve your number rounding down until you reach 1, e.g. 42 -> 21 -> 10 -> 5 -> 2 -> 1. now go backwards through this sequence and put a 1 if it's odd and put a 0 if it's even. so 42 = 101010.
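    Both directions, sketched in C with hypothetical helper names operating on strings of '0'/'1' characters:
    #include <stdio.h>
    /* Binary string to value: read left to right, double and add each bit. */
    unsigned long bin_to_dec(const char *bits) {
        unsigned long value = 0;
        for (const char *p = bits; *p != '\0'; p++)
            value = value * 2 + (unsigned long)(*p - '0');
        return value;
    }
    /* Value to binary string: record parity while halving, then write it out backwards. */
    void dec_to_bin(unsigned long value, char *out) {
        char tmp[65];
        int n = 0;
        do {
            tmp[n++] = (char)('0' + (value % 2));   /* 1 if odd, 0 if even */
            value /= 2;                             /* halve, rounding down */
        } while (value > 0);
        for (int i = 0; i < n; i++)                 /* reverse into the output buffer */
            out[i] = tmp[n - 1 - i];
        out[n] = '\0';
    }
    int main(void) {
        char buf[65];
        printf("%lu\n", bin_to_dec("101010"));      /* prints 42 */
        dec_to_bin(42, buf);
        printf("%s\n", buf);                        /* prints 101010 */
        return 0;
    }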

  • @sean_vikoren
    @sean_vikoren 2 years ago +1

    1) You rock, thank you for making world better.
    2) Focus fail hurts eyes.

  • @b2bb
    @b2bb 2 years ago

    I know it was touched on toward the end of the video but I think a part II to this video where Dr. Pounds could go into a specific application example where this is used. Can always use more videos with him!

  • @conradludgate
    @conradludgate 2 years ago +17

    I did the 3^45 mod 7 in my head fairly simply. 3 and 7 are coprime, so 3^6 ≡ 1 (mod 7) by Fermat's little theorem. Then we can do 3^42 * 3^3 mod 7, which is just 1 * 3^3, or 27 mod 7, which is 6. Still a very useful algorithm though.

    • @thenewnew1997
      @thenewnew1997 2 years ago +8

      Well, the method you use is efficient for a human; unfortunately computers don't see things the way we do and can't apply it instinctively, and it only covers one particular case. This algorithm applies to every case with at most 2*floor(log2(n)) + 1 multiplications in the worst case, where n is the exponent, so it's very efficient in terms of complexity. Anyway, the method you use is very useful too, just for humans, not PCs.

    • @SimonBuchanNz
      @SimonBuchanNz 2 years ago +2

      In encryption, properties like this are what make the selection of values so important. In this case, the modulus is generally thousands of bits, while the public exponent is either 3 or 65537 (and the base is the message, which must be less than the modulus).

  • @Joe_Payne
    @Joe_Payne 2 years ago

    I literally submitted my RSA cryptography coursework two weeks ago. This is all fresh in my mind. I'd love to see this go to the next step.

  • @user-vn9ld2ce1s
    @user-vn9ld2ce1s 2 years ago +6

    You could explain this much more easily and without binary like this:
    You take the exponent and apply two rules until you get to 1:
    - if it's odd, subtract one
    - if it's even, divide by two
    Then you just do the squares/multiplies in reverse order of these operations.
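    A small C sketch of that rule, recording the operations while reducing the exponent and then replaying them in reverse (names invented for illustration; it assumes the exponent is at least 1 and the modulus is small enough that products fit in 64 bits):
    #include <stdint.h>
    /* base^exp mod m: reduce exp with the two rules (odd: subtract 1, even: halve),
       then replay the recorded rules backwards as multiplies and squares. */
    uint64_t powmod_rules(uint64_t base, uint64_t exp, uint64_t m) {
        char ops[128];                       /* plenty of room for a 64-bit exponent */
        int n = 0;
        while (exp > 1) {
            if (exp % 2 == 1) { ops[n++] = 'M'; exp -= 1; }   /* odd: subtract one */
            else              { ops[n++] = 'S'; exp /= 2; }   /* even: divide by two */
        }
        uint64_t b = base % m;
        uint64_t result = b;                 /* the exponent has been reduced to 1 */
        while (n-- > 0) {                    /* replay in reverse order */
            if (ops[n] == 'S') result = (result * result) % m;   /* square */
            else               result = (result * b) % m;        /* multiply */
        }
        return result;
    }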

    • @Loldemord
      @Loldemord 2 years ago

      This is basically how you create the binary number out of base 10 ^^ So it's the same

    • @deanjohnson8233
      @deanjohnson8233 2 years ago +1

      That might explain it, but that is not how it would be programmed efficiently

    • @user-vn9ld2ce1s
      @user-vn9ld2ce1s 2 years ago

      @@Loldemord True

    • @user-vn9ld2ce1s
      @user-vn9ld2ce1s 2 years ago

      @@deanjohnson8233 That's probably true, if we're talking about something like assembly (those bit shifts are single opcodes, aren't they?), but if I were doing this in Python, it probably wouldn't matter...

    • @deanjohnson8233
      @deanjohnson8233 2 years ago

      @@user-vn9ld2ce1s You would implement it like this in assembly, C, C++, Go, Rust, C#, Java and many more. Bit shifting is not a rare and unusual thing.
      In Python it would be strange because Python does not have fixed numeric sizes. Using bitwise operations on something like that can easily lead to the wrong result if you don't carefully study what Python does in various cases.
      Also, this video was about how to efficiently do this math. If you are concerned with the efficiency of math operations, you probably aren't going to be using Python.

  • @timsmith2525
    @timsmith2525 11 months ago

    I love the idea of solving a complicated problem by solving a lot of simpler problems. Genius!

  • @thatcreole9913
    @thatcreole9913 2 years ago

    This was fantastic!

  • @anonymousvevo8697
    @anonymousvevo8697 3 months ago

    Amazing each time I watch your videos

  • @JivanPal
    @JivanPal 2 years ago

    Excellent video! Alternative summarised explanation: exponentiation in a sense "converts" addition to multiplication (see 3Blue1Brown's excellent intro to group theory and e^(iπ) = -1 for an exploration of this). The algorithm for converting a bitstring to a number (or equivalently, the binary representation of a number to its decimal representation) is to start with zero and read the number from left to right, doubling when you see a new digit, and then adding the value of that digit (i.e. add nothing if it's "0", or add 1 / increment if it's "1"). For example, the binary number 101110101 is equal to decimal 373, as follows, reading the digits of the binary representation from left to right:
    • Start with 0.
    • Read a digit, "1": double, then increment, giving 1.
    • Read "0": double, giving 2.
    • Read "1": double, then increment, giving 5.
    • Read "1": double, then increment, giving 11.
    • Read "1": double, then increment, giving 23.
    • Read "0": double, giving 46.
    • Read "1": double, then increment, giving 93.
    • Read "0": double, giving 186.
    • Read "1": double, then increment, giving 373.
    The square and multiply algorithm just starts off with the base of the exponent (i.e. 23 as in the video) rather than 0, and replaces each doubling operation with a squaring, and each increment operation with a multiplication by the base. That is, exponentiation with base 23 has converted addition of 1 into multiplication by the base, 23. Likewise, doubling a number (which is the same as adding a number to itself) has been converted into squaring a number (which is the same as multiplying a number by itself).

  • @samharkness8861
    @samharkness8861 2 years ago

    Great video, thanks! He belongs in Numberphile videos too

  • @eggsquishit
    @eggsquishit 2 years ago +16

    You can do multiplication this way, too (by doubling & adding). Very useful on CPUs that can only do addition.

    • @trejkaz
      @trejkaz 2 years ago +2

      This is also how I've seen multiplication done on mechanical calculators.

    • @PvblivsAelivs
      @PvblivsAelivs 2 years ago

      You can. But it's faster to subtract squares. You have to build the table first. But you don't need multiplies to do it.

    • @canaDavid1
      @canaDavid1 1 year ago +1

      @@PvblivsAelivs this depends on the speed of memory accesses, and how much memory space is available. But yes, table lookups are usually faster.

  • @sembutininverse
    @sembutininverse 2 years ago

    thank you guys🙏🏻🙏🏻🙏🏻🙏🏻🙏🏻, it was really insightful.
    ♥️

  • @applePrincess
    @applePrincess 2 years ago +3

    I love this (semi-)collaboration. You are the Computerphile version of Matt Parker in every way.

    • @benwisey
      @benwisey 2 years ago +2

      Matt Parker and Mike Pound. MP=MP.

  • @timholloway7413
    @timholloway7413 1 year ago +1

    The one involving modulo 7 can be done relatively easily: as it's prime, we know 3^6 is congruent to 1 mod 7 (Fermat's little theorem), then do 45 mod 6 and hence get to (3^6)^7 * (3^3) mod 7, which is (1)^7 * (3^3) mod 7, which is of course 6 mod 7.

  • @piiumlkj6497
    @piiumlkj6497 2 years ago +1

    This man is a legend

  • @johnsenchak1428
    @johnsenchak1428 2 years ago

    MIND BLOWING !

  • @gloverelaxis
    @gloverelaxis 2 years ago

    god Dr Pound is so good at explaining things

  • @dougfoo
    @dougfoo 2 years ago

    cool trick, didn't realize that relation
    i love this series

  • @lightyagmi4925
    @lightyagmi4925 2 years ago +1

    We call it the binary exponentiation algorithm.
    For example, 3^10 = ?
    We write its exponent in binary: 10 = 1010.
    The bits which are set will be included in the final answer, as we can calculate all exponents which are powers of two very quickly.
    3^10 = 3^8 * 3^2

  • @tdchayes
    @tdchayes 2 years ago +7

    It's true that using this algorithm on the private key exponent is more expensive than with the specially chosen public exponent (a 2048-bit exponent -> up to ~4096 multiplies). However, since for the RSA algorithm the private key holder knows the factors used for the key, an algorithm based on the Chinese Remainder Theorem can reduce the cost of the private key operations.
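    A toy C sketch of that CRT shortcut, using the small textbook RSA parameters p = 61, q = 53, e = 17, d = 2753 purely for illustration (real keys are thousands of bits and need a big-integer library):
    #include <stdio.h>
    #include <stdint.h>
    /* Square-and-multiply modular exponentiation, right-to-left. */
    static uint64_t powmod(uint64_t b, uint64_t e, uint64_t m) {
        uint64_t r = 1;
        b %= m;
        while (e) {
            if (e & 1) r = r * b % m;
            b = b * b % m;
            e >>= 1;
        }
        return r;
    }
    int main(void) {
        uint64_t p = 61, q = 53, n = p * q, e = 17, d = 2753;
        uint64_t dP = d % (p - 1), dQ = d % (q - 1);   /* half-size CRT exponents */
        uint64_t qInv = 38;                            /* q^{-1} mod p, precomputed */
        uint64_t msg = 65;
        uint64_t c = powmod(msg, e, n);                /* encrypt with the public key */
        /* Decrypt with two half-size exponentiations instead of one full-size one. */
        uint64_t m1 = powmod(c, dP, p);
        uint64_t m2 = powmod(c, dQ, q);
        uint64_t h = qInv * ((m1 + p - m2 % p) % p) % p;
        uint64_t recovered = m2 + h * q;
        printf("%llu\n", (unsigned long long)recovered);   /* prints 65 */
        return 0;
    }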

    • @666Tomato666
      @666Tomato666 2 years ago

      yes, but CRT reduces the cost by a factor of about 3, so the private key operations are still slower than the public key operations which need to calculate power by a 16 bit number

  • @demonblood8841
    @demonblood8841 2 years ago +8

    This guy should have his own channel lol great stuff tho love it

  • @franziscoschmidt
    @franziscoschmidt 2 years ago +5

    Saw an implementation of this in a programming tutorial video but they just rushed over the details. Computerphile does a wonderful job at filling this gap (as always I might add!)

  • @richardyao9012
    @richardyao9012 1 year ago

    I always did square and multiply from the least significant bit first. In C, this is:
    double pow(double x, int exp) {
        unsigned int e = (exp >= 0) ? exp : -exp;   /* work with the magnitude of the exponent */
        double result = 1.0;
        while (e) {
            if (e & 1) {
                result *= x;   /* multiply in the current power for a set bit */
            }
            x *= x;            /* square for the next bit */
            e >>= 1;
        }
        if (exp < 0)
            return (1.0 / result);
        return (result);
    }
    When I do it on paper by hand, I just calculate all squares first. Then I multiply every result corresponding to a 1 bit, starting from the least significant bit. Of course, the order in which I multiply does not matter, but it is how I always did it.

  • @estapeluo
    @estapeluo 2 years ago

    Waiting for those follow-up videos!

  • @wolfoftheair
    @wolfoftheair 1 year ago +1

    So, it turns out Square and Multiply on its own is not the greatest scheme for cryptography, because it lends itself to side channel attacks (timing and power usage).
    The way this is addressed is through a Montgomery Ladder, where every square operation is performed, and every multiply is performed, but the bit that determines whether it's a simple square or a multiplication actually determines where the output is placed. If it's intended to be used, it goes in the "correct" output location and mixed usefully in with the result. If it's not, it goes into an incorrect output location, and mixed in with all the other side-effect garbage from the function. This results in the power draw and time being constant, which defeats those side-channel attacks.
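    A compact C sketch of that ladder for modular exponentiation (illustrative names; it assumes a modulus small enough that products fit in 64 bits, and a real constant-time implementation would also replace the visible branch with a constant-time conditional swap):
    #include <stdint.h>
    /* Montgomery ladder: one square and one multiply happen on every bit;
       the exponent bit only decides which register receives which result. */
    uint64_t powmod_ladder(uint64_t base, uint64_t exp, uint64_t mod) {
        uint64_t r0 = 1 % mod;        /* running result */
        uint64_t r1 = base % mod;     /* always one multiplication "ahead" of r0 */
        for (int bit = 63; bit >= 0; bit--) {
            if ((exp >> bit) & 1) {
                r0 = r0 * r1 % mod;   /* multiply lands in the "useful" register */
                r1 = r1 * r1 % mod;   /* square lands in the other one */
            } else {
                r1 = r0 * r1 % mod;
                r0 = r0 * r0 % mod;
            }
        }
        return r0;
    }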

  • @irwainnornossa4605
    @irwainnornossa4605 2 years ago

    Amazing video, I almost want to incorporate it into my program.

  • @demon_hunter7905
    @demon_hunter7905 7 months ago

    fooking genius mate

  • @wktodd
    @wktodd 2 years ago +1

    Mike Pound - Always good value 8-)

  • @cameronsteel6147
    @cameronsteel6147 2 years ago +3

    Such a cool method! The very fact that you can calculate 3^45 mod 7 on paper in a few minutes is awesome considering 3^45 has 22 digits!

    • @trejkaz
      @trejkaz 2 years ago +1

      You can do it faster. For instance, observe that 3^6 mod 7 is 1. So 3^45 mod 7 is going to be the same as 3^3 mod 7, which is 6.

    • @JivanPal
      @JivanPal 2 years ago

      @@trejkaz That depends on you knowing the totient of the modulus / order of the multiplicative group, which is hard if you don't know the prime factorisation of the modulus.

    • @trejkaz
      @trejkaz 2 years ago

      @@JivanPal I don't know much about groups at all and didn't really use any group theory to do that solution, just normal modular arithmetic. Although, in the video he did say that the modulus is usually prime for these examples, so I don't think I'd have too much trouble determining the factors either.

    • @JivanPal
      @JivanPal 2 years ago +1

      @@trejkaz It depends. The hardness of that factorisation problem is what gives RSA its security. The totient function, denoted φ(x), counts how many numbers less than x are coprime to x. It is such that φ(p) = p-1 where _p_ is any prime, and φ(ab) = φ(a)·φ(b) where _a_ and _b_ are any two coprime integers. The encryptor's/signer's secret is a pair of large primes, _p_ and _q,_ that serve as the private key, and the public knowledge that serves as the public key is their product, _n_ = _pq._
      Thus, the encryptor/signer is always dealing with _p_ and _q,_ whereas the decryptor/verifier is always dealing with _n,_ whose prime factorisation he doesn't know. Without that knowledge, computing φ(n) is hard; with that knowledge, it is trivial: φ(n) = (p-1)(q-1). If he could figure out the prime factorisation, the encryption scheme is broken, precisely because he then knows φ(n) and can thus quickly compute these modular exponentials we're interested in: g^x mod n = g^[x mod φ(n)] mod n.

    • @thenewnew1997
      @thenewnew1997 2 years ago

      @@trejkaz Can you generalise that method to every case for a computer? This one generalises to every case with at most 2*floor(log2(n)) + 1 multiplications (n being the exponent), which is already very efficient. While I'm at it, I'll also remind you that computers don't have instinct or intelligence like us.

  • @michaelhunte743
    @michaelhunte743 1 year ago

    Nice use of symmetry and multiplication.

  • @ricardoabh3242
    @ricardoabh3242 2 years ago

    Crazy impressive

  • @devonbraner4353
    @devonbraner4353 2 years ago

    cool video! cool algorithm!

  • @SillyMakesVids
    @SillyMakesVids 2 years ago

    That's a wicked smart algorithm.

  • @deekshantmalvi4612
    @deekshantmalvi4612 2 years ago

    Thanks man. ❤️❤️

  • @hemangchauhan2864
    @hemangchauhan2864 2 years ago

    This is really clever

  • @appropinquo3236
    @appropinquo3236 2 years ago

    This is really cool! I'm glad that i learned about binary numbers, because I wouldn't have been able to understand any of this otherwise.

  • @martixbg
    @martixbg 2 years ago

    This was a fascinating video even if the algorithm was rather obvious.

  • @KX36
    @KX36 2 years ago

    Nice how you built the eyes-glazing-over effect into the video, so my eyes didn't have to do it themselves this time like they have in some other videos (because things are complex, not because they're boring) :D

  • @SlimThrull
    @SlimThrull 1 year ago

    Huh. I was using a similar but substantially slower method. Good to know it can be improved upon.

  • @MM-by6qq
    @MM-by6qq 1 year ago

    thank you!!

  • @jotrockenmitlocken
    @jotrockenmitlocken 10 months ago

    Very helpful.

  • @KlaasDeSmedt
    @KlaasDeSmedt 2 years ago

    12:55 you can work backwards: if it's odd, subtract 1, if it's even, divide by 2 ;)

  • @zyghom
    @zyghom 2 years ago

    math is amazing, especially if used in the correct way ;-) you guys are also AMAZING! ;)

  • @nodroGnotlrahC
    @nodroGnotlrahC 2 years ago +4

    Basically Russian Multiplication (covered by Johnny Ball on Numberphile), but square and multiply instead of double and add. Surprised that wasn't mentioned.

    • @JivanPal
      @JivanPal 2 years ago

      Indeed! That is the basis of a common efficient algorithm for converting string representations of integers expressed in base _n_ into actual integer datatypes, too, e.g. for decimal in C:
      char* input_string = "285657";
      int result = 0;
      for (char* c = input_string; *c != '\0'; c++) {
          result = result * 10 + (*c - '0');
      }
      Or for capitalised hexadecimal (isdigit comes from ctype.h):
      char* input_string = "45BD9";
      int result = 0;
      for (char* c = input_string; *c != '\0'; c++) {
          result = result * 16 + (isdigit(*c) ? *c - '0' : 10 + *c - 'A');
      }

  • @nickdunstone
    @nickdunstone 8 months ago

    Yet again I see 65537 which coincidentally is my favourite number too! It's the 17 bit big brother of 257.

  • @KaneYork
    @KaneYork 2 years ago

    Are you going to make a followup talking about addition chains?

  • @drskelebone
    @drskelebone 2 years ago

    Is there a video about modulo math commuting for both addition and multiplication? I don't remember one, and it seems to be explicitly required here.

    • @JivanPal
      @JivanPal 2 years ago

      What do you mean by "commuting" here?

  • @abdallahegniia1672
    @abdallahegniia1672 6 months ago

    A comment that I've liked:
    "This man has forgotten more about computers than I will ever learn"

  • @anon_y_mousse
    @anon_y_mousse 2 years ago

    Being half asleep when watching this, for a moment when he was going over all the steps I thought I was nodding off, but nope, it was just the camera. Perhaps get the camera some coffee in the future.

  • @bwill325
    @bwill325 2 years ago

    I always wondered how we dealt with such enormous numbers

  • @Tristoo
    @Tristoo 2 years ago

    the timing/power thing at the end is called a side channel attack

  • @jeremyahagan
    @jeremyahagan 2 years ago

    Does this represent the smallest number of steps for any given exponent?

  • @edwealleans
    @edwealleans 2 years ago

    A cool video and topic, but I would like to nitpick the camera placement when Dr Mike writes on his paper. Surely the camera could have been on his left side?

  • @Alecu100
    @Alecu100 2 years ago

    A similar algorithm can be applied for divisions.

  • @pdrg
    @pdrg 2 years ago

    What an interesting algorithm

  • @saultube44
    @saultube44 1 year ago

    This could be used to compress files, it would need serious modifications, but it has potential

  • @heaslyben
    @heaslyben 2 years ago

    I didn't learn this one at school. It's gorgeous! Thank you.

  • @snack711
    @snack711 1 year ago

    math rules, i love these kind of tricks

  • @misterkite
    @misterkite 1 year ago

    There are a couple of Project Euler questions that this will help solve.

  • @greatestever2914
    @greatestever2914 2 years ago

    Yeah! In my assembly class, when we had to program, there was no multiplication operator, so you'd have to shift the bits of the binary value in the registers to actually multiply two numbers. Don't even get me started on bringing a value into the registers (and these values can't be stored in just any register), and then pulling the value out... storing it elsewhere... god bless...

  • @j7ndominica051
    @j7ndominica051 2 years ago

    When making a really big number, if you can't do the intermediary mod trick, how would the computer handle overflow to another word?

    • @nebuleon
      @nebuleon 2 years ago +1

      If you can't do "mod" at any step of the way, you have to allocate enough memory for a multi-word number having, as a number of bits, at least the sum of the number of bits in both multiplicands at every multiplication.
      For example, if you're at a point where base^64 is 415 bits, you need at least 830 bits for a square (base^64 x base^64), since both multiplicands are 415 bits and 415 + 415 = 830. Then the multiplication proceeds as usual: on a 64-bit computer, bits 63 to 0 of each multiplicand contribute to partial sums at bits 127 to 0 of the result, and so on, until bits 447 to 384 contribute to partial sums at bits 895 to 768 of the result.
      You could always pre-allocate enough bits for the entire number in advance. Then you would execute a modified square and multiply algorithm that just calculates how many bits you're likely to need to hold the final result given all the multiplications involved, allocate that (say it's 16190 bits), and execute the proper square and multiply in multi-word arithmetic on 16190 bits.

  • @ksc91u
    @ksc91u 2 years ago

    Could you make a video about OPAQUE?

  • @impulsiveDecider
    @impulsiveDecider 2 years ago

    FINALLY SOMETHING I CAN APPLY MY MATH III KNOWLEDGE ON.

  • @vikingthedude
    @vikingthedude 2 years ago

    So is modulo distributive over multiplication? Is that why we can keep the numbers small?

  • @andrewjknott
    @andrewjknott 2 years ago +2

    5:24 - cleaner explanation to convert 45 -> 101101 -> 32 + 8 + 4 + 1.

    • @NoNameAtAll2
      @NoNameAtAll2 2 years ago +1

      That's the reverse of this algorithm:
      you calculated the powers of 2 and multiplied them together.
      In the video, 101101 -> (((((1)*2+0)*2+1)*2+1)*2+0)*2+1

    • @JivanPal
      @JivanPal 2 years ago

      @@NoNameAtAll2 Or in postfix notation to avoid all those parentheses: 1 2× 0+ 2× 1+ 2× 1+ 2× 0+ 2× 1+.

    • @NoNameAtAll2
      @NoNameAtAll2 2 years ago

      @@JivanPal why not prefix then?
      + * + * + * + * + * 1 2 0 2 1 2 1 2 0 2 1
      :)

  • @christopherg2347
    @christopherg2347 1 year ago

    Squaring is a bit like multiplication, while multiplication is a bit like addition.
    Just in the amount it will change the overall result.

  • @ujjawalsinha8968
    @ujjawalsinha8968 2 years ago

    Interesting, so a different base (instead of 2) can give different performance. For base = 3, would it be called the cube, square and multiply algorithm?

    • @JivanPal
      @JivanPal 2 years ago

      It is true that in base _n,_ you will determine which of _n_ operations to do (e.g. for _n_ = 3, these are multiply, square, or cube). However, cubing is just multiplying a number by itself twice, so _n_ = 3 actually gives *_worse_* performance. In fact, _n_ ≥ 4 also gives worse performance, since what would be squaring twice in the _n_ = 2 case would become multiplying a number by itself four times. Thus, _n_ = 2 gives the best average performance.
      Even more generally, square and multiply does not give the best possible performance / shortest possible algorithm for any given exponent, but it is the best we can do without solving a harder problem for the exponent first, which on average would give us much worse performance. It is this: if we know the totient of the modulus _m_ (or equivalently, the order of the multiplicative group { g^x mod m | x ∈ 𝐙 }, where _g_ is the generator of the group, i.e. the base of the modular exponential we're trying to compute, which was 23 in the video), which is denoted φ(m), then we have g^[φ(m)] mod m = 1 (by an extension of Fermat's Little Theorem), and so
      g^x mod m
      = g^[q φ(m) + r] mod m
      = ( g^[φ(m)] ^ q ) g^r mod m
      = g^r mod m,
      where _q_ and _r_ are the quotient and remainder of _x_ divided by φ(m), respectively. However, computing φ(m) is hard unless the prime factorisation of _m_ is known, so in practice this is not used often. The hardness of this problem is what underpins the security of RSA.

  • @kdawg3484
    @kdawg3484 2 years ago

    Mike Pound for Numberphile video, please.