Regarding the final point about reading a private key from power usage, wouldn't it already need to be on the hacker's computer anyway for this to work? In which case, they already have your private key.
No; see "side-channel attack". For example, the encryption may be running in a trusted execution environment (like ARM TrustZone or Apple Secure Enclave) and the adversary is running untrusted code outside of that environment that can determine power consumption / voltage levels / timings. Another example: the adversary is sitting outside your house with a supersensitive probe in the electricity lines that run into your bedroom, where your desktop computer is plugged in, and is able to determine changes in voltage on the lines that way, which correspond to your computer's power consumption. In both examples, the adversary does _not_ have any access to the trusted environment, but is able to acquire information about the behaviour of that environment which can be reverse-engineered to determine what was actually happening in that environment.
If you can't do "mod" at any step of the way, you have to allocate enough memory for a multi-word number having, as a number of bits, at least the sum of the number of bits in both multiplicands at every multiplication. For example, if you're at a point where base^64 is 415 bits, you need at least 830 bits for a square (base^64 x base^64), since both multiplicands are 415 bits and 415 + 415 = 830. Then the multiplication proceeds as usual: on a 64-bit computer, bits 63 to 0 of each multiplicand contribute to partial sums at bits 127 to 0 of the result, and so on, until bits 447 to 384 contribute to partial sums at bits 895 to 768 of the result. You could always pre-allocate enough bits for the entire number in advance. Then you would execute a modified square and multiply algorithm that just calculates how many bits you're likely to need to hold the final result given all the multiplications involved, allocate that (say it's 16190 bits), and execute the proper square and multiply in multi-word arithmetic on 16190 bits.
I wonder if square and multiply would lose to a naive multiply-only method for a small modulo m and a large exponent x. The naive method would multiply everytime and mod m, then cache the result, quickly generating a repeating pattern of length at most m, and then x mod patternLength would give the place in the pattern which is the answer. So the number of operations is at most m, compared to a square and multiply which can be very expensive for very large x.
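Here's a minimal Python sketch of that naive cycle method (my own toy helper, assuming gcd(base, m) = 1 so the sequence of residues is purely periodic):
def cycle_pow(base, exp, m):
    powers = [1 % m]                      # powers[k] = base^k mod m
    value = 1
    while True:
        value = value * base % m
        if value == powers[0]:            # cycle closed; length = len(powers)
            return powers[exp % len(powers)]
        powers.append(value)

assert cycle_pow(3, 45, 7) == pow(3, 45, 7)   # 6
As you say, the cache is bounded by m, so for tiny moduli and huge exponents this can win; for cryptographic-size moduli the cache is hopeless.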
It is true that in base _n,_ you will determine which of _n_ operations to do (e.g. for _n_ = 3, these are multiply, square, or cube). However, cubing is just multiplying a number by itself twice, so _n_ = 3 actually gives *_worse_* performance. In fact, _n_ ≥ 4 also gives worse performance, since what would be squaring twice in the _n_ = 2 case would become multiplying a number by itself four times. Thus, _n_ = 2 gives the best average performance. Even more generally, square and multiply does not give the best possible performance / shortest possible algorithm for any given exponent, but it is the best we can do without solving a harder problem for the exponent first, which on average would give us much worse performance. It is this: if we know the totient of the modulus _m_ (or equivalently, the order of the multiplicative group { g^x mod m | x ∈ 𝐙 }, where _g_ is the generator of the group, i.e. the base of the modular exponential we're trying to compute, which was 23 in the video), which is denoted φ(m), then we have g^[φ(m)] mod m = 1 (by an extension of Fermat's Little Theorem), and so g^x mod m = g^[q φ(m) + r] mod m = ( g^[φ(m)] ^ q ) g^r mod m = g^r mod m, where _q_ and _r_ are the quotient and remainder of _x_ divided by φ(m), respectively. However, computing φ(m) is hard unless the prime factorisation of _m_ is known, so in practice this is not used often. The hardness of this problem is what underpins the security of RSA.
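A quick Python check of that last identity, with toy primes of my own choosing (the real m would be thousands of bits):
p, q = 743, 751                        # φ(m) is trivial when you know the factors
m, phi = p * q, (p - 1) * (q - 1)
g, x = 23, 10**9 + 7                   # gcd(g, m) = 1 here
assert pow(g, x, m) == pow(g, x % phi, m)   # exponent reduced mod φ(m)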
yeah! in my assembly class, when we had to program there was no multiplication instruction, so you'd have to shift the bits of the binary value in the registers to actually multiply two numbers. Don't even get me started on bringing a value into the registers, and these values can't be stored in just any register... and then pulling the value out... storing it elsewhere... god bless...
MOD is one of the slowest operations on a computer, normally taking 32 or 64 cycles. Instead of MODing by the actual number each time, can you MOD by the next higher power of 2 and then do a MOD by the real value at the end? MOD by a power of 2 is simply an AND with a bit-mask, which is very fast.
No, we cannot lop off upper bits that way and mod by the real number at the end. Here's the answer if we rerun the square-and-multiply but use "mod 8" to reduce at intermediate steps. For those unfamiliar reading this comment, this is "and 7", because "and 7" is a bitmask that keeps the lower 3 bits, which is equivalent to "mod 8": 15 mod 8 = 7, and 15 and 7 = 1111b and 0111b = 0111b = 7. Anyway, 3^45 mod 7 = 3^101101b mod 7 starts with 3:
3 and 7 = 3
Square (3^1 -> 3^2): 3x3 = 9; 9 and 7 = 1
Square (3^2 -> 3^4): 1x1 = 1; 1 and 7 = 1
Multiply (3^4 -> 3^5): 1x3 = 3; 3 and 7 = 3
Square (3^5 -> 3^10): 3x3 = 9; 9 and 7 = 1
Multiply (3^10 -> 3^11): 1x3 = 3; 3 and 7 = 3
Square (3^11 -> 3^22): 3x3 = 9; 9 and 7 = 1
Square (3^22 -> 3^44): 1x1 = 1; 1 and 7 = 1
Multiply (3^44 -> 3^45): 1x3 = 3
Final reduction with mod 7: 3 mod 7 = 3
The answer given by Dr Pound (and indeed WolframAlpha) is 6, but with "mod 8" intermediate reduction, we get 3. We could reduce "mod" a multiple of 7, so "mod 14" would be a valid intermediate reduction, as would "mod 21", "mod 28" and so on. However, none of those multiples of 7 are going to be a power of 2 and allow for fast lopping off of bits with AND: a power of 2 has only 2s as factors, and no 7.
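The same point as a quick Python check (the helper below is my own sketch, not from the video; it takes the intermediate reduction as a parameter):
def square_and_multiply(base, exp, reduce):
    acc = base
    for bit in bin(exp)[3:]:              # binary digits after the leading 1
        acc = reduce(acc * acc)
        if bit == '1':
            acc = reduce(acc * base)
    return acc

print(square_and_multiply(3, 45, lambda v: v % 7) % 7)   # 6, correct
print(square_and_multiply(3, 45, lambda v: v & 7) % 7)   # 3, wrong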
OpenSSL has a big number library. If you look at their documentation, look up BIGNUM and any function starting with BN_*. It dynamically allocates a buffer of however many bits it needs. So I guess the data type is an array of bytes. There are other bignum libs out there, that's just the one I've used.
Easier for humans, not for computers; please don't forget that computers don't have instinct like us humans, and writing out each case scenario is a nightmare. This method can be generalised with complexity O(2*floor(log2(n))+1), where n is the exponent of the number, which is very efficient in terms of complexity (big O means worst case scenario).
People who have worked with assembly programming will be really used to this. In the past, CPUs did not have a multiply instruction, and you often had a table of the fastest way to multiply numbers, which, you guessed it, was shifts (which are like the squaring step here) and addition.
even modern CPUs have shortcuts for squaring and for multiplying by small numbers, so this algo is still beneficial
Johnny Ball covered this on Numberphile before - search "Russian multiplication".
Let's call the two numbers A and B.
With number A, we shift all the bits to the right once.
The rightmost bit "falls out" of the number and typically gets shifted into a flag on most CPUs - let's say our CPU shifts the bit out of the right into the carry flag.
So, the carry flag now contains the rightmost (least significant) bit of number A. If it's a one, then we add number B to our running total. If it's a zero, carry on (so you could code that as a "branch if carry not set" over an "add B to total" instruction).
Then we shift B left one, which doubles it.
Then we do it again. Shift A to the right. The rightmost bit falls out into the carry flag. If it's a one, then we add B (which we've just doubled, remember) to the running total.
Shift B left one, doubling it again.
Shift A to the right. If it's a one, then add B (now four times bigger than it originally was, as we're shifting it left every round) to the running total.
You keep doing this until you've shifted every original bit of A out of the right side. Stop. You've multiplied the two numbers together and your answer is in the running total register.
And all we did was shifting bits left and right, and simple addition. And there's only as many "rounds" of this as there are bits in A.
Another bit of useful binary maths is that the result cannot be more than twice the number of bits in B. That is, if A and B are 8-bit numbers, the result register only needs to be 16 bits - as it's just not possible for two 8-bit numbers to multiply to more than a 16-bit number.
Indeed, if you're implementing this algorithm, then stick B in a register with twice as many bits and have your "running total" register be twice as many bits. Then you can run the algorithm blindly.
Which is, as you may have guessed, what the CPU's actually doing in the circuitry with a hardware multiply.
It's just multiplying by 2 - which, in binary, is nothing more than shifting all the bits to the left once - and addition.
So, yeah, it's basically the same algorithm as this video, but working one order higher. So multiplication -> exponentiation. And multiplying by 2 -> squaring. And addition -> multiplication.
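The whole loop fits in a few lines of Python, if anyone wants to play with it (a sketch; Python ints stand in for the registers and the carry flag):
def shift_add_multiply(a, b):
    total = 0
    while a:                  # one round per bit of A
        if a & 1:             # the bit that just "fell out" on the right
            total += b        # add the current (doubled) B
        a >>= 1               # shift A right
        b <<= 1               # shift B left, doubling it
    return total

assert shift_add_multiply(23, 45) == 23 * 45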
@@NoNameAtAll2 Are those tables actually part of the CPU (making use of microcode) or done by a compiler?
@@RegrinderAlert multipliers are simple enough to be just logic gates
what I was talking about is "check if top 48 bits are 0, so we don't need to wait/use most of the circuit"
that, too, is simple enough to be just logic
about compilers... idk how common the explicit "square" command is in processors
yes, except that's a shift & add algorithm, & shifts aren't like squaring but like multiplying by a power of 2.
Note that for RSA and similar, the modular multiplication operation itself can be quite expensive, so modern implementations typically convert the numbers involved to an intermediate representation, called a Montgomery form, after Peter Montgomery. The binary exponentiation method can use Montgomery forms throughout, so only at the end is the result converted back to a conventional representation. Montgomery multiplication is also resistant to the side channel attacks mentioned at the end of the video.
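For the curious, here's a toy Python sketch of the Montgomery trick (the names are mine; real implementations work limb by limb in constant time, and stay in Montgomery form across the whole exponentiation rather than converting back per multiply):
def mont_mul(a, b, n):                    # n must be odd
    k = n.bit_length()
    R = 1 << k                            # R = 2^k > n, gcd(R, n) = 1
    n_prime = (-pow(n, -1, R)) % R        # n' = -n^(-1) mod R (Python 3.8+)
    def redc(T):                          # REDC: returns T * R^(-1) mod n
        m = ((T & (R - 1)) * n_prime) & (R - 1)
        t = (T + m * n) >> k              # dividing by R is just a shift, no slow mod
        return t - n if t >= n else t
    a_bar, b_bar = a * R % n, b * R % n   # convert to Montgomery form
    return redc(redc(a_bar * b_bar))      # multiply, then convert back

assert mont_mul(23, 23, 747) == 23 * 23 % 747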
Now I have to look up Montgomery forms - or wait for the Computerphile video. I do like internet rabbit holes.
@@andrewharrison8436 also look up Montgomery Ladder which is the similar algorithm for elliptic curves
fascinating, I had no idea this exists
I think the last bit of the video is fascinating - that you could perform an attack to work out a key based on the CPU time to calculate a square vs a square & multiply. A great example of the theoretical mathematics being ideal vs. the real-world implementation being fundamentally vulnerable.
Yes! And the technical term for it is a "timing attack".
Timing attacks can be so insidious that you need to resort to assembly language just to get everything out of the way.
Dr Pound's example has us doing an "always multiply", multiplying by one (the multiplication identity) after squaring for an unset bit in the exponent, so that we execute the algorithm in constant time. However, using a statement like [if (bit is zero) { multiplicand = one; } else { multiplicand = base; }] to do this "always multiplication" can end up in the *branch predictor's way.* If there are lots of zeroes in your key, it's going to take the "if" path more of the time; conversely, if there are lots of ones in your key, it's going to take the "else" path more of the time. Either way, the branch predictor will execute it faster than if the key were evenly-distributed ones and zeroes.
To execute *that* in constant time, the CPU has to have a branchless test instruction to set the new multiplicand. Either you validate that the compiler uses the branchless instruction in your C code (say), or you write at least that part of the algorithm in assembly language.
Edit: Or use the Montgomery form of the numbers, per David Gillies's comment, which makes it easier to have constant-time algorithms
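To illustrate the branchless select idiom from a couple of paragraphs up (sketched in Python for readability; in practice you'd check the compiled C or write the assembly yourself, since Python ints are not remotely constant-time):
def select_multiplicand(bit, base):
    mask = -(bit & 1)                    # all-ones if bit == 1, all-zeros if bit == 0
    return (base & mask) | (1 & ~mask)   # bitwise choice, no branch

assert select_multiplicand(0, 23) == 1    # unset bit: dummy multiply by one
assert select_multiplicand(1, 23) == 23   # set bit: multiply by the base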
In data centers that have servers that the government uses, the government requires that their servers are plugged into an air-gapped power supply, unconnected to the power that every other server uses, so that a spy couldn't surreptitiously measure changes in their power usage. These are surprisingly effective attacks.
That's why cryptographic implementations need to run in constant time: no optimisation allowed for the dummy multiplications by one.
@@nebuleon Multiply by base^bit. Thus if the bit is 0, it will be multiplied by 1, and for a 1 bit it's the ordinary multiply. As x^n where n is 0 or 1 is easily calculated, that should take the same time (?)
Why not just simply add a random sleep at the end of the algorithm? The size of the sleep would be a question, but if it's a random amount each time that is sufficiently large to mask any work being done (or lack thereof), it would remove this timing attack issue. It's certainly wasteful to simply sleep, but it's also wasteful to do unnecessary calculations to remain in constant time, no?
Dr Pounds breadth and depth of knowledge in computer science never cease to amaze me!!
"Man from the future"
Absolutely, but this specific piece of knowledge is pretty much common knowledge for anyone in CS. I studied this in college about 7 years ago, it was NOT a fun time since we had a pretty bad professor...
Not necessarily the breadth, but connecting the theoretical with the practical. Bit on timing attacks was pretty nice.
@@quincy2142 he has a very impeccable pedagogy
Even if he doesn't have the knowledge I still like his teaching style, it's engaging and fun.
Interesting algorithm.
At first I thought it was just going to be a simple, "first we build our list of binary equivalents and then just multiply them all together in the end."
As an example, calculate 3^1, 3^2, 3^4, 3^8, 3^16, etc. And then choose the values our exponent actually contains.
Then the sleight of hand of the mathematicians came in at 9:40 and made things far, far simpler and much easier to execute in practice.
You might have noticed that Mike said that this was the left to right method. Yours (with taking the modulus) is the right to left variant and is just as quick.
I always liked calculating that recursively. For example, 2^6 is (2^3)*(2^3), 2^3 is (2^2)*(2^1) and 2^2 is (2^1)*(2^1).
It's extremely simple to code it too. Here's the algorithm that performs the calculation:
1) If exponent is zero, return 1
2) Divide exponent by two, and save both the quotient and the remainder
3) Call algorithm recursively with (exponent = quotient) and save the result
4) If remainder is zero, return result*result
5) If remainder is one, return result*result*base
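In Python that's a near-literal transcription (my sketch; put "% m" after each multiplication for the modular version):
def power(base, exponent):
    if exponent == 0:                              # step 1
        return 1
    quotient, remainder = divmod(exponent, 2)      # step 2
    result = power(base, quotient)                 # step 3
    if remainder == 0:                             # step 4
        return result * result
    return result * result * base                  # step 5

assert power(2, 6) == 64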
Unless you cache the results, this is no faster than multiplying one-by-one (probably slower because of recursion overhead)
@@canaDavid1 I don't think you appreciate the difference between O(N) and O(logN).
@@longlostwraith5106 you have O(2^log(N)) so O(N)
@@schwingedeshaehers How, exactly? Are you taking the division into account?
@@longlostwraith5106 you have log n layers, but the layers involve more and more calculation at each level, and that grows exponentially up to the log n barrier set by the number of layers
9:34 It's actually not the minimum number of operations. For example, to make 31 by this method takes 8 operations (SMSMSMSM), whereas the minimum is only 7 operations (N^2, N*(N^2), (N^3)^2, (N^6)^2, (N^12)^2, (N^6)*(N^24), N*(N^30)). However, in general finding the minimum number for a given exponent is NP-complete, so in practice square and multiply is presumably what you'd do. Otherwise, great video!
The NP-completeness is a common misconception: this is only proven for sets of numbers, not single numbers. In practice I believe a window method is used, for which one precomputes some values so one can "combine" some multiplications.
@@pikasnoop6552 Thanks for the correction
Efficient exponentiation is a very interesting topic. The minimum number of multiplications required for N is an open problem in mathematics, you can read more about that on OEIS A003313, "Length of shortest addition chain for n".
That example seems to use more space (you need to remember N^6 until after you finish (N^12)^2). May be the minimum operations in fixed space, where the space is exactly the size of the input string.
Also important for cryptographic libraries which shouldn’t be allocating memory dynamically.
If you introduce division as an additional operation, the example of 31 can be reduced to 6 operations: SSSSSD
I read binary numbers left to right by starting at 1, doubling for each bit, and adding 1 if the bit was a 1. Very cool to see this pattern coming up in exponents
Also called Russian peasant multiplication. It works for any power operation tbh, not only scalar multiplication. The power operation on matrices can for instance be used to compute large Fibonacci numbers very quickly, using the 2x2 matrix [[1, 1], [1, 0]]
This is also known as "Fast Binary Exponentiation", which calculates pow(a, b, mod) in logarithmic time.
I love how entertaining the video is given that I already know what the answer is and have used this quite a lot
We actually did this in school😅 This really helped me understand it
Thank you very much!
what a coincidence, i started studying RSA and y'all put out this banger, thanks Mike and Sean
binary to decimal: go left to right, start from 0 and double if it's a 0 and double and add one if it's a 1. e.g. for 101010 you do 0 -> 1 -> 2 -> 5 -> 10 -> 21 -> 42. so 101010 = 42.
decimal to binary: halve your number rounding down until you reach 1, e.g. 42 -> 21 -> 10 -> 5 -> 2 -> 1. now go backwards through this sequence and put a 1 if it's odd and put a 0 if it's even. so 42 = 101010.
This video was a lovely reminder of my time spent with number theorists in college, cryptography is so damn fascinating
I am currently purchasing master degree in cybersecurity and this guy summerize a 2h of lectures in literally 17min ;)
Why are you purchasing a degree? Maybe do it somewhere with better teaching then?
@@QuantumHistorian he probably meant pursuing
I’m afraid that is the model for university education these days
@@ait-gacemnabil9181yeah, you right. but, in some sense, it means actually a business activities for university.
In a way, he is right. Nowadays everything needs to be purchased -- even knowledge.
1) You rock, thank you for making world better.
2) Focus fail hurts eyes.
This was fascinating, so simple and intuitive once explained but so powerful
This is also called "exponentiation by squaring" and it's super useful in many cases.
One quick example is in computing the nth Fibonacci number using the 2x2 matrix formula, where one raises a matrix to the nth power. But using this method, the number of multiplications is greatly reduced. There is also a closed form expression using a power of the golden ratio but that requires a lot of numerical precision for large n.
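A sketch of that Fibonacci trick, using plain exponentiation by squaring on 2x2 lists (the helper names are mine):
def mat_mul(a, b):
    return [[a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]],
            [a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]]]

def fib(n):
    result = [[1, 0], [0, 1]]            # identity matrix
    m = [[1, 1], [1, 0]]
    while n:
        if n & 1:
            result = mat_mul(result, m)  # this power of two is in n
        m = mat_mul(m, m)                # square the matrix
        n >>= 1
    return result[0][1]                  # M^n = [[F(n+1), F(n)], [F(n), F(n-1)]]

assert [fib(i) for i in range(8)] == [0, 1, 1, 2, 3, 5, 8, 13]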
i cant believe how brilliant this explanation actually was!!Kudoss
I did the 3^45 mod 7 in my head fairly simply. 3 and 7 are coprime, so you know the powers of 3 cycle, with 3^6 ≡ 1 mod 7. Then we can do 3^42 * 3^3 mod 7, which is just 1 * 3^3, or 27 mod 7, which is 6. Still a very useful algorithm though
Well, the algorithm you use is efficient for a human, but unfortunately computers don't see things the way we do and can't instinctively spot tricks like that, and it only covers one particular case. This algorithm can be applied in every case scenario, with complexity O(2*floor(log2(n)) + 1) (worst case scenario, hence big O), where n is the exponent of the number, so it is very efficient in terms of complexity. Anyway, the method you use is very useful too, just for humans, not PCs.
In encryption, properties like this are what make the selection of values so necessary. In this case, the modulus is generally thousands of bits, while the public exponent is typically either 3 or 65537 (and the base is the message, which must be less than the modulus)
So, it turns out Square and Multiply on its own is not the greatest scheme for cryptography, because it lends itself to side channel attacks (timing and power usage).
The way this is addressed is through a Montgomery Ladder, where every square operation is performed, and every multiply is performed, but the bit that determines whether it's a simple square or a multiplication actually determines where the output is placed. If it's intended to be used, it goes in the "correct" output location and mixed usefully in with the result. If it's not, it goes into an incorrect output location, and mixed in with all the other side-effect garbage from the function. This results in the power draw and time being constant, which defeats those side-channel attacks.
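A toy Python sketch of the ladder structure (mine, and only illustrative; real code also does the selection branchlessly and works in Montgomery form). Both registers get written every round, and the secret bit only decides which register receives the square and which the product:
def ladder_pow(base, exp, m):
    r0, r1 = 1, base % m                 # invariant: r1 = r0 * base (mod m)
    for bit in bin(exp)[2:]:             # most significant bit first
        if bit == '1':
            r0, r1 = r0 * r1 % m, r1 * r1 % m
        else:
            r0, r1 = r0 * r0 % m, r0 * r1 % m
    return r0

assert ladder_pow(23, 373, 747) == pow(23, 373, 747)   # 131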
You Sir saved my life before the exam, I can't thank you enough.
I love the idea of solving a complicated problem by solving a lot of simpler problems. Genius!
Excellent video! Alternative summarised explanation: exponentiation in a sense "converts" addition to multiplication (see 3Blue1Brown's excellent intro to group theory and e^(iπ) = -1 for an exploration of this). The algorithm for converting a bitstring to a number (or equivalently, the binary representation of a number to its decimal representation) is to start with zero and read the number from left to right, doubling when you see a new digit, and then adding the value of that digit (i.e. add nothing if it's "0", or add 1 / increment if it's "1"). For example, the binary number 101110101 is equal to decimal 373, as follows, reading the digits of the binary representation from left to right:
• Start with 0.
• Read a digit, "1": double, then increment, giving 1.
• Read "0": double, giving 2.
• Read "1": double, then increment, giving 5.
• Read "1": double, then increment, giving 11.
• Read "1": double, then increment, giving 23.
• Read "0": double, giving 46.
• Read "1": double, then increment, giving 93.
• Read "0": double, giving 186.
• Read "1": double, then increment, giving 373.
The square and multiply algorithm just starts off with the base of the exponent (i.e. 23 as in the video) rather than 0, and replaces each doubling operation with a squaring, and each increment operation with a multiplication by the base. That is, exponentiation with base 23 has converted addition of 1 into multiplication by the base, 23. Likewise, doubling a number (which is the same as adding a number to itself) has been converted into squaring a number (which is the same as multiplying a number by itself).
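Side by side in Python, the correspondence is plain to see (my sketch; starting the accumulator at 1 rather than the base lets the same loop handle the leading bit):
def value_of(bits):                      # double-and-increment
    n = 0
    for b in bits:
        n = 2 * n + int(b)
    return n

def pow_of(base, bits, m):               # square-and-multiply, same shape
    acc = 1
    for b in bits:
        acc = acc * acc % m              # "double" has become "square"
        if b == '1':
            acc = acc * base % m         # "increment" has become "multiply by base"
    return acc

bits = "101110101"
assert value_of(bits) == 373
assert pow_of(23, bits, 747) == pow(23, 373, 747)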
You could explain this much more easily and without binary like this:
You take the exponents and apply two rules until you get to 1:
- if it's odd, subtract one
- if it's even, divide by two
Then you just do the squares/multiplies in reverse order of these operations.
This is basically how you create the binary number out of base 10 ^^ So it's the same
That might explain it, but that is not how it would be programmed efficiently
@@Loldemord True
@@deanjohnson8233 That's probably true, if we're talking about something like assembly (those bit shifts are single opcodes, aren't they?), but if I were doing this in Python, it probably wouldn't matter...
@@user-vn9ld2ce1s you would implement it like this in assembly, c, c++, go, rust, c#, Java and many more. Bit shifting is not a rare and unusual thing.
In python it would be strange because python does not have fixed numeric sizes. Using bitwise operations on something like that can easily lead to the wrong result if you don’t carefully study what Python does in various cases.
Also, this video was about how to efficiently do this math. If you are concerned with the efficiency of math operations, you probably aren’t going to be using python.
It's true that using this algorithm on the private key exponent is more expensive than the specially chosen public exponent. (2048 bit exponent -> 4096 multiplies). However since for the RSA algorithm, the private key holder knows the factors used for the key, an algorithm based on the Chinese Remainder Theorem can reduce the cost of the private key operations.
yes, but CRT reduces the cost by a factor of about 3, so the private key operations are still slower than the public key operations which need to calculate power by a 16 bit number
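A hypothetical sketch of the CRT shortcut in Python, for the curious (toy primes, and no padding or blinding; never do RSA this way for real):
def rsa_crt(c, p, q, d):
    dp, dq = d % (p - 1), d % (q - 1)    # reduced private exponents
    mp = pow(c % p, dp, p)               # half-size exponentiation mod p
    mq = pow(c % q, dq, q)               # half-size exponentiation mod q
    q_inv = pow(q, -1, p)                # precomputable (Python 3.8+)
    h = q_inv * (mp - mq) % p            # Garner's recombination
    return mq + h * q

p, q, e = 61, 53, 17
n, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)
assert rsa_crt(pow(42, e, n), p, q, d) == 42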
I remember using this algorithm for a competitive programming question on one of the codechef's monthly contests, didn't know it had a name.
i did this as well in codeforces
i love the channel. the slowness of the math demonstrations made me itchy!
I literally submitted my rsa cryptography coursework in two weeks ago. This is all fresh in my mind. I'd love to see this go to the next step.
Whats the next step?
For people not understanding binary, here is a way to do it without using binary:
1) Let the power be x.
2a) If x is odd, subtract 1 from it and write M (multiply) on a piece of paper.
2b) If x is even, halve it and write S (square) on the paper.
3) With the new x (either half or one less), repeat steps 2a/2b until x becomes 1.
4) When x is 1, do the steps shown in the video, but start from the bottom of the list.
For example: For 373,
M 372
S 186
S 93
M 92
S 46
S 23
M 22
S 11
M 10
S 5
M 4
S 2
S 1
Then the correct step would be S S M S M S M S S M S S M(this is just the order of operations, you would still have to do all the mod and other things)
Note: S=Square; M=Multiply
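The recipe transcribes directly to Python (my sketch; the schedule function records the letters top-down and replays them bottom-up):
def schedule(x):
    ops = []
    while x != 1:
        if x % 2:                # odd: subtract one, write M
            ops.append('M')
            x -= 1
        else:                    # even: halve, write S
            ops.append('S')
            x //= 2
    return ops[::-1]             # start from the bottom of the list

def pow_mod(base, exp, m):
    acc = base
    for op in schedule(exp):
        acc = acc * acc % m if op == 'S' else acc * base % m
    return acc

assert ''.join(schedule(373)) == 'SSMSMSMSSMSSM'
assert pow_mod(23, 373, 747) == pow(23, 373, 747)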
I love this (semi-)collaboration. You are the Computerphile version of Matt Parker in every way.
Matt Parker and Mike Pound. MP=MP.
The one involving modulo 7 can be done relatively easily- as it’s prime we know 3^6 is congruent to 1 mod 7 ( Fermat’s little theorem ), then do 45 mod 6 and hence get to (3^6)^7*(3^3) mod 7 which is (1)^7*(3^3) mod 7 which is of course 6 mod 7.
This is the coolest maths/CS video I've seen in a long time. Wow!
Dude, you’re such a fun guy to listen to. Keep it up, cheers!
The last minute was spot on - avoiding side attacks!
You can do multiplication this way, too (by doubling & adding). Very useful on CPUs that can only do addition.
This is also how I've seen multiplication done on mechanical calculators.
You can. But it's faster to subtract squares. You have to build the table first. But you don't need multiplies to do it.
@@PvblivsAelivs this depends on the speed of memory accesses, and how much memory space is available. But yes, table lookups are usually faster.
I always did square and multiply from the least significant bit first. In C, this is:
double pow(double x, int exp) {
    unsigned int e = (exp >= 0) ? exp : -exp;  /* work with the magnitude */
    double result = 1.0;
    while (e) {
        if (e & 1) {       /* low bit set: multiply the current square in */
            result *= x;
        }
        x *= x;            /* square for the next bit */
        e >>= 1;
    }
    if (exp < 0)
        return (1.0 / result);
    return (result);
}
When I do it on paper by hand, I just calculate all squares first. Then I multiply every result corresponding to a 1 bit, starting from the least significant bit. Of course, the order in which I multiply does not matter, but it is how I always did it.
Truly interesting lesson, just studied this at school!
Fantastic! I watched the same Numberphile video and did the high power mod p calculation on my calculator. And during the process I realised that to get the right number of squares, you turn the power into its binary number and then square the same number of times as the power of two. (And mod every time the answer is bigger than 747). I even checked it by finding the nearest actual primes, which are 743 and 751. Perfect 1s for both!
Saw an implementation of this in a programming tutorial video but they just rushed over the details. Computerphile does a wonderful job at filling this gap (as always I might add!)
This guy should have his own channel lol great stuff tho love it
Such a cool method! The very fact that you can calculate 3^45 mod 7 on paper in a few minutes is awesome considering 3^45 has 22 digits!
You can do it faster. For instance, observe that 3^6 mod 7 is 1. So 3^45 mod 7 is going to be the same as 3^3 mod 7, which is 6.
@@trejkaz That depends on you knowing the totient of the modulus / order of the multiplicative group, which is hard if you don't know the prime factorisation of the modulus.
@@JivanPal I don't know much about groups at all and didn't really use any group theory to do that solution, just normal modular arithmetic. Although, in the video he did say that the modulus is usually prime for these examples, so I don't think I'd have too much trouble determining the factors either.
@@trejkaz It depends. The hardness of that factorisation problem is what gives RSA its security. The totient function, denoted φ(x), counts how many numbers less than x are coprime to x. It is such that φ(p) = p-1 where _p_ is any prime, and φ(ab) = φ(a)·φ(b) where _a_ and _b_ are any two integers. The encryptor's/signer's secret is a pair of large primes, _p_ and _q,_ that serve as the private key, and the public knowledge that serves as the public key is their product, _n_ = _pq._
Thus, the encryptor/signer is always dealing with _p_ and _q,_ whereas the decryptor/verifier is always dealing with _n,_ whose prime factorisation he doesn't know. Without that knowledge, computing φ(n) is hard; with that knowledge, it is trivial: φ(n) = (p-1)(q-1). If he could figure out the prime factorisation, the encryption scheme is broken, precisely because he then knows φ(n) and can thus quickly compute these modular exponentials we're interested in: g^x mod n = g^[x mod φ(n)] mod n.
@@trejkaz can you generalise this method to every case scenario for computers? This method can be generalised to every case scenario with complexity O(2*floor(log2(n))+1), and it is very efficient already (n being the exponent of the number to verify). Since I'm at it, I'll also remind you that computers don't have instinct or intelligence like us
Thank you for this video. One of the most interesting things I've ever done in school, and I'd almost forgotten about it.
You guys and numberphile my 2 favorite channels
I know it was touched on toward the end of the video, but I think a part II to this video, where Dr. Pound goes into a specific application example of where this is used, would be great. Can always use more videos with him!
Always wondered how the key reading timing/power attacks worked, that makes a lot of sense, cheers!
god Dr Pound is so good at explaining things
We call it the binary exponentiation algorithm.
For example, 3^10 = ?
We write its power in binary: 10 = 1010.
The bits which are set will be included in the final answer, as we can calculate all exponents which are powers of two very quickly:
3^10 = 3^8 * 3^2
This is just mind-blowing! Awesome video!
Basically Russian Multiplication (covered by Johnny Ball on Numberphile), but square and multiply instead of double and add. Surprised that wasn't mentioned.
Indeed! That is the basis of a common efficient algorithm for converting string representations of integers expressed in base _n_ into actual integer datatypes, too, e.g. for decimal in C (note: multiply the running total up *before* adding each digit, and stop at the terminating NUL):
char* input_string = "285657";
int result = 0;
for (char* c = input_string; *c != '\0'; c++) {
    result *= 10;              /* shift the previous digits up one place */
    result += *c - '0';        /* bring in the new digit */
}
Or for capitalised hexadecimal (isdigit comes from <ctype.h>):
char* input_string = "45BD9";
int result = 0;
for (char* c = input_string; *c != '\0'; c++) {
    result *= 16;
    result += isdigit(*c) ? *c - '0' : 10 + *c - 'A';
}
Great video, thanks! He belongs in Numberphile videos too
I love the step-by-step simplistic explanations you give and the focus on the core concepts. Thanks Mike! bloody champion! Peace and Love from Perth Australia!✌❤✌
5:24 - cleaner explanation to convert 45 -> 101101 -> 32 + 8 + 4 + 1.
that's the reverse of this algorithm
you calculated powers of 2 and multiplied them together
in the video 101101 -> (((((1)*2+0)*2+1)*2+1)*2+0)*2+1
@@NoNameAtAll2 Or in postfix notation to avoid all those parentheses: 1 2× 0+ 2× 1+ 2× 1+ 2× 0+ 2× 1+.
@@JivanPal why not prefix then?
+ * + * + * + * + * 1 2 0 2 1 2 1 2 0 2 1
:)
12:55 you can work backwards: if it's odd, subtract 1; if it's even, divide by 2 ;)
Interesting algorithm and great video as usual
This man is a legend
I LOVE this type of content !!!! Thanks a million for making and sharing :)
Waiting for those follow-up videos!
This was a fascinating video even if the algorithm was rather obvious.
nice how you built the eyes-glazing-over effect into the video, so my eyes didn't have to this time like they have in some other videos (because things are complex, not because they're boring) :D
Being half asleep when watching this, for a moment when he was going over all the steps I thought I was nodding off, but nope, it was just the camera. Perhaps get the camera some coffee in the future.
Dr. Pound-- @12:56 "I could have worked backwards, right? If only we could do that."
You can! Call it the "subtract and divide" method.
PSEUDO-CODE
k = 373
when k is even, k = k//2 and left-insert a 0 into result
when k is odd, k = (k-1)//2 and left-insert a 1 into result

# python code
# assume k is positive; no error checking included
# longer code shown for readability - not pythonic, just another way to convert k to binary
k = 373
result = []
# result.insert(i, v) inserts value v at index i
print(k)
# printing k at every step to reflect the exact powers shown in the video
while k != 1:  # stopping at 1 is better than 0, but requires the final left-insert of a 1 below
    if k % 2 == 0:
        k //= 2
        print(k)
        result.insert(0, 0)
    else:
        k -= 1
        print(k)
        k //= 2
        print(k)
        result.insert(0, 1)
result.insert(0, 1)
print(result)
A cool video and topic, but I would like to nitpick the camera placement when Dr Mike writes on his paper. Surely the camera could have been on his left side?
Nice use of symmetry and multiplication.
cool trick, didn't realize that relation
i love this series
Is there a proof that that's the fastest decomposition to exponentiate a number? The algorithm here clearly works in all cases in, at worst, 2·log2(m) multiplications when exponentiating by m, but it's not at all obvious to me that this is the fewest possible number of steps for all m. I vaguely recall hearing a few years ago that in general this was actually still an open problem.
That's because it's not. For instance, if you knew your exponent was x^y with x odd, it could be faster to raise to the power x repeatedly.
However, this approach is useful on computers because they work in binary, so finding which powers of 2 make up the exponent is trivial.
It indeed is still an open problem (finding the shortest addition chain for an arbitrary exponent). The fact that square-and-multiply is not always fastest can be seen even for n = 15: one can compute (x^3)^5 and save a step.
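To make that concrete, here is a small Python sketch counting multiplications both ways (the modulus 747 is arbitrary, just to keep the numbers small):

x, m = 23, 747

# Square-and-multiply on 15 = 0b1111: 3 squarings + 3 multiplies = 6 multiplications.
acc, mults = x, 0
for bit in bin(15)[3:]:       # the bits after the leading 1
    acc = acc * acc % m; mults += 1
    if bit == '1':
        acc = acc * x % m; mults += 1
print(acc, mults)             # result, 6

# Addition chain (x^3)^5: x^2, x^3, x^6, x^12, x^15 = only 5 multiplications.
x2 = x * x % m        # 1
x3 = x2 * x % m       # 2
x6 = x3 * x3 % m      # 3
x12 = x6 * x6 % m     # 4
x15 = x12 * x3 % m    # 5
print(x15, 5)

assert acc == x15 == pow(x, 15, m)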
Are you going to make a followup talking about addition chains?
My eyes hurt after watching the autofocus go berserk :D
The original comment (with 1 mistake and 1 misunderstanding):
23
(23 * 23) mod 747 === 529 (this number is ignored in the final step)
(529 * 529) mod 747 === 463
(463 * 463) mod 747 === 727 (this number is ignored in the final step)
then
400
142
742
25 (this number is ignored in the final step)
625
The final result is (23 + 463 + 400 + 142 + 742 + 625) mod 747 === 154 (note: this line is wrong)
Edit:
According to the 3rd comment in this thread, from Jivan Pal, the last line is wrong: the correct operation is multiplication, not addition.
Also, my version is not an in-place algorithm. Jivan Pal shows exactly what the method in the video does.
No; this appears to be a common misconception amongst viewers of this video. Mike is / we are *_not_* raising 23 to the powers of 2, and then multiplying together the values that correspond to the bits that are set to 1 in the binary representation of the exponent. That is, we are *_not_* computing 23, 463, 400, 142, etc. and then multiplying them together. That is an alternative algorithm which reads the binary digits from right to left (least significant first) and requires more memory, since you need to store all those intermediate values somewhere until you've got them all. _[Note also that you made a mistake in that final step, summing the values rather than multiplying them; you should've got 131 as the answer, not 154.]_
What we *_are_* doing is reading the binary representation of the exponent _from left to right (most significant bit first),_ squaring a working value (a.k.a. "accumulator" or "acc" for short) with each digit that is read, and subsequently multiplying the acc by the base if that digit happened to be a "1". The acc starts out equal to the base. The working is as follows for the example in the video, i.e. base 23, exponent 373 (bitstring "101110101"), with all instances of "=" being congruences modulo 747:
• Read the first (nonzero) digit, which is (of course, being nonzero) "1". Set acc ← base = 23.
• Read the next digit, which is "0". Set acc ← acc² = 23² = 529.
• Read "1":
(a) Set acc ← acc² = 529² = 463.
(b) We read a "1", so multiply by the base: acc ← acc × base = 463 × 23 = 191. *_[This is where square-and-multiply deviates from your method!]_*
• Read "1": acc ← acc² × base = 191² × 23 = 182.
• Read "1": acc ← acc² × base = 182² × 23 = 659.
• Read "0": acc ← acc² = 659² = 274.
• Read "1": acc ← acc² × base = 274² × 23 = 431.
• Read "0": acc ← acc² = 431² = 505.
• Read "1": acc ← acc² × base = 505² × 23 = 131.
The answer is the final acc value: 23^373 mod 747 = 131.
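For anyone who wants to run this, here is a minimal Python version of exactly that left-to-right procedure (my own sketch, not code from the video; it assumes exp >= 1):

def square_and_multiply(base, exp, m):
    # Read the exponent's bits most-significant first: square for every bit,
    # and additionally multiply by the base when the bit is 1.
    acc = base % m                 # the leading bit of exp is always 1
    for bit in bin(exp)[3:]:       # remaining bits after '0b1'
        acc = acc * acc % m
        if bit == '1':
            acc = acc * base % m
    return acc

print(square_and_multiply(23, 373, 747))  # 131
print(pow(23, 373, 747))                  # 131, Python's built-in as a reference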
@@JivanPal Yeah, many thanks. I guess this could be described as an in-place calculation. OK, I'm going to edit my comment.
A comment that I've liked:
"This man has forgotten more about computers than I will ever learn."
Huh. I was using a similar but substantially slower method. Good to know it can be improved upon.
That's a wicked smart algorithm.
So is modulo distributive over multiplication? Is that why we can keep the numbers small?
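Essentially yes: the identity at work is (a × b) mod m = ((a mod m) × (b mod m)) mod m, which is why every intermediate result can be reduced and kept small. A quick brute-force sanity check in Python (my own, purely illustrative):

import random

# (a * b) % m == ((a % m) * (b % m)) % m for all positive a, b, m
for _ in range(1000):
    a = random.randrange(1, 10**9)
    b = random.randrange(1, 10**9)
    m = random.randrange(1, 10**9)
    assert (a * b) % m == ((a % m) * (b % m)) % m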
Amazed each time I watch your videos
Is there a video about modulo math commuting for both addition and multiplication? I don't remember one, and it seems to be explicitly required here.
What do you mean by "commuting" here?
This was fantastic!
Regarding the final point about reading a private key from power usage, wouldn't it already need to be on the hacker's computer anyway for this to work? In which case, they already have your private key.
No; see "side-channel attack". For example, the encryption may be running in a trusted execution environment (like ARM TrustZone or Apple Secure Enclave) and the adversary is running untrusted code outside of that environment that can determine power consumption / voltage levels / timings.
Another example: the adversary is sitting outside your house with a supersensitive probe in the electricity lines that run into your bedroom, where your desktop computer is plugged in, and is able to determine changes in voltage on the lines that way, which correspond to your computer's power consumption.
In both examples, the adversary does _not_ have any access to the trusted environment, but is able to acquire information about the behaviour of that environment which can be reverse-engineered to determine what was actually happening in that environment.
@@JivanPal thanks for the reply, that clears things up
Does this represent the smallest number of steps for any given exponent?
12:58 why can't we walk back? If you have the binary, you know the sequence of squares and multiplies, and there's an easy reverse...
This is really cool! I'm glad that i learned about binary numbers, because I wouldn't have been able to understand any of this otherwise.
Mike Pound - Always good value 8-)
Amazing video, I almost want to incorporate it into my program.
When making a really big number, if you can't do the intermediary mod trick, how would the computer handle overflow to another word?
If you can't do "mod" at any step of the way, you have to allocate enough memory for a multi-word number whose number of bits is at least the sum of the number of bits in both multiplicands, at every multiplication.
For example, if you're at a point where base^64 is 415 bits, you need at least 830 bits for a square (base^64 x base^64), since both multiplicands are 415 bits and 415 + 415 = 830. Then the multiplication proceeds as usual: on a 64-bit computer, bits 63 to 0 of each multiplicand contribute to partial sums at bits 127 to 0 of the result, and so on, until bits 447 to 384 contribute to partial sums at bits 895 to 768 of the result.
You could always pre-allocate enough bits for the entire number in advance. Then you would execute a modified square-and-multiply algorithm that first calculates how many bits the final result is likely to need given all the multiplications involved, allocates that (say it's 16190 bits), and then executes the proper square and multiply in multi-word arithmetic on 16190 bits.
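As a quick illustration of that sizing rule (Python's built-in bignums make the check easy; the 415-bit figure is just the example from the comment above):

a = (1 << 415) - 1           # a 415-bit number, standing in for some base**64
print(a.bit_length())        # 415
print((a * a).bit_length())  # 830 - a product never needs more bits than
                             # the sum of its operands' bit lengths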
I wonder if square and multiply would lose to a naive multiply-only method for a small modulus m and a large exponent x. The naive method would multiply every time and mod m, then cache the result, quickly generating a repeating pattern of length at most m, and then x mod patternLength would give the place in the pattern which is the answer. So the number of operations is at most m, compared to square and multiply, which can be very expensive for very large x.
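A rough Python sketch of that caching idea (entirely mine, purely illustrative; assumes exp >= 1 and m >= 2). One subtlety: if the base and m share a factor, the sequence can have a non-repeating prefix before the cycle begins, so the sketch records where the cycle actually starts:

def powmod_by_cycle(base, exp, m):
    # Build base^1, base^2, ... mod m until a value repeats
    # (at most m + 1 steps, by the pigeonhole principle).
    seen = {}    # value -> exponent at which it first appeared
    seq = []     # seq[i] == base^(i+1) mod m
    value = 1
    for i in range(1, m + 2):
        value = value * base % m
        if value in seen:
            start = seen[value]      # exponent where the cycle begins
            period = i - start       # cycle length
            if exp < start:
                return seq[exp - 1]
            return seq[start - 1 + (exp - start) % period]
        seen[value] = i
        seq.append(value)

print(powmod_by_cycle(3, 45, 7))  # 6
print(pow(3, 45, 7))              # 6, reference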
Interesting, so a different base (instead of 2) can give different performance. For base = 3, would it be called the cube, square and multiply algorithm?
It is true that in base _n,_ you will determine which of _n_ operations to do (e.g. for _n_ = 3, these are multiply, square, or cube). However, cubing is just multiplying a number by itself twice, so _n_ = 3 actually gives *_worse_* performance. In fact, _n_ ≥ 4 also gives worse performance, since what would be squaring twice in the _n_ = 2 case would become multiplying a number by itself four times. Thus, _n_ = 2 gives the best average performance.
Even more generally, square and multiply does not give the best possible performance / shortest possible algorithm for any given exponent, but it is the best we can do without first solving a harder problem for the exponent, which on average would give us much worse performance. That harder problem is this: suppose we know the totient of the modulus _m,_ denoted φ(m) (or equivalently, the order of the multiplicative group { g^x mod m | x ∈ 𝐙 }, where _g_ is the generator of the group, i.e. the base of the modular exponential we're trying to compute, which was 23 in the video). Then we have g^[φ(m)] mod m = 1 (by Euler's generalisation of Fermat's Little Theorem), and so
g^x mod m
= g^[q·φ(m) + r] mod m
= ( g^[φ(m)] )^q · g^r mod m
= g^r mod m,
where _q_ and _r_ are the quotient and remainder of _x_ divided by φ(m), respectively. However, computing φ(m) is hard unless the prime factorisation of _m_ is known, so in practice this is not used often. The hardness of this problem is what underpins the security of RSA.
yeah! in my assembly class, when we had to program there was no multiply instruction, so you'd have to shift the bits of the binary values in the registers to actually multiply two numbers. Don't even get me started on bringing values into registers - and these values can't be stored in just any register... and then pulling the value out... storing it elsewhere... god bless...
Very helpful.
thank you guys🙏🏻🙏🏻🙏🏻🙏🏻🙏🏻, it was really insightful.
♥️
Could you make a video about opaque?
MIND BLOWING !
I like the fact that he just watches Numberphile casually
FINALLY SOMETHING I CAN APPLY MY MATH III KNOWLEDGE ON.
math rules, i love these kind of tricks
MOD is one of the slowest operations on a computer, normally taking 32 or 64 cycles. Instead of MODing by the actual number each time, can you MOD by the next higher power of 2 and then do a MOD by the real value at the end? MOD by a power of 2 is simply an AND with a bit-mask, which is very fast.
No, we cannot lop off upper bits that way and mod by the real number at the end.
Here's the answer if we rerun the square-and-multiply but use "mod 8" to reduce at intermediate steps. For those unfamiliar: "mod 8" is the same as "and 7", because "and 7" is a bitmask that keeps the lower 3 bits. E.g. 15 mod 8 = 7, and 15 and 7 = 1111b and 0111b = 0111b = 7.
Anyway,
3^45 mod 7 = 3^101101b mod 7
starts with 3
3 And 7 = 3
Square (3^1 -> 3^2): 3x3 = 9
9 And 7 = 1
Square (3^2 -> 3^4): 1x1 = 1
1 And 7 = 1
Multiply (3^4 -> 3^5): 1x3 = 3
3 And 7 = 3
Square (3^5 -> 3^10): 3x3 = 9
9 And 7 = 1
Multiply (3^10 -> 3^11): 1x3 = 3
3 And 7 = 3
Square (3^11 -> 3^22): 3x3 = 9
9 And 7 = 1
Square (3^22 -> 3^44): 1x1 = 1
1 And 7 = 1
Multiply (3^44 -> 3^45): 1x3 = 3
Final reduction with mod 7: 3 mod 7 = 3
The answer given by Dr Pound (and indeed WolframAlpha) is 6, but with "mod 8" intermediate reduction, we get 3.
We could reduce "mod" a multiple of 7, so "mod 14" would be a valid intermediate reduction, as would "mod 21", "mod 28" and so on. However, none of those multiples of 7 are going to be a power of 2 and allow for fast lopping off of bits with AND: a power of 2 would have only 2s as factors, and not any 7.
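To see both outcomes side by side, here is a small Python check (my own sketch; "mod 14" is valid at intermediate steps because 7 divides 14, whereas "mod 8" is not):

def modexp_reduced(base, exp, inter_mod, final_mod):
    # Left-to-right square-and-multiply, reducing by inter_mod at each step,
    # with a final reduction by final_mod. Assumes exp >= 1.
    acc = base % inter_mod
    for bit in bin(exp)[3:]:
        acc = acc * acc % inter_mod
        if bit == '1':
            acc = acc * base % inter_mod
    return acc % final_mod

print(modexp_reduced(3, 45, 14, 7))  # 6 - correct, since 7 divides 14
print(modexp_reduced(3, 45, 8, 7))   # 3 - wrong, since 7 does not divide 8
print(pow(3, 45, 7))                 # 6 - reference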
Crazy impressive
fooking genius mate
What data types hold numbers that big? If this is so common why is it so hard to find libraries to do computations on numbers this big?
OpenSSL has a big number library. If you look at their documentation, look up BIGNUM and any function starting with BN_*. It dynamically allocates a buffer of however many bits it needs, so I guess the data type is an array of bytes. There are other bignum libs out there; that's just the one I've used.
@@DFPercush thanks, I’ll give it a try.
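Worth noting that some languages have arbitrary-precision integers built in, so no library is needed. Python's three-argument pow, for instance, does modular exponentiation on ints of any size:

print(pow(23, 373, 747))  # 131 - reduces as it goes, stays fast
print(23**373 % 747)      # same answer, but materialises the enormous intermediate value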
Squaring is a bit like multiplication, while multiplication is a bit like addition.
Just in the amount it will change the overall result.
Cubes will be easier in this case. 3^3 = 27, which is (-1) in mod 7. 45 = 3 * 15, so the answer is (-1)^15 = -1 = 6 in mod 7.
Easier for humans, not for computers - please don't forget that computers don't have instinct like us humans, and writing out each special case is a nightmare. This method can be generalised with complexity O(2*floor(log2(n))+1), where n is the exponent, which is very efficient (big O meaning worst case).
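(For the record, the parent comment's shortcut checks out; here is a one-line Python verification:)

# 3^3 = 27 = -1 (mod 7) and 45 = 3 * 15, so 3^45 = (-1)^15 = -1 = 6 (mod 7)
assert pow(3, 45, 7) == (-1) % 7 == 6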