Efficient Exponentiation
- Published: Feb 6, 2025
- How many multiplications does it take to compute x^n?
It may be fewer than you think! Worried that calling x ** 15 "slow" is not correct? Don't worry, I tested!
In[4]: timeit.timeit("100000 ** 15")
Out[4]: 0.3686526000000043
In[5]: timeit.timeit("pow_15_minimal(100000)", globals=globals())
Out[5]: 0.3275107000000048
― mCoding with James Murphy (mcoding.io)
Source code: github.com/mCo...
Project Euler: projecteuler.n...
Addition Chains: graal.ens-lyon...
Compiler Explorer: godbolt.org/z/...
I've never seen this channel and it appeared in my recommendations within 43 minutes of publishing... So this is the start of everything...
Yeah this bloke is good.
Thanks!
Welcome to the grind vibe.
RUclips knows our weaknesses
Same
You should rename this channel to "reality check". You put my programming knowledge to shambles.
Same
I love the feeling of talking to someone who knows much more than I do about something. The greater their knowledge is in comparison to my own, the more I stand to learn from the conversation!
@@mCoding then please do your best to teach us, esteemed mCoding.
I laughed when he said "and I'm not even going to get into that" as he was giving me the deepest look into such code that I've ever seen. Knowing this man is just scratching the surface humored me. Well done man, I see the channel is climbing significantly with every video, you deserve it.
Thanks! It's short for "this video would be 3 hours long, and neither of us wants that". Simple questions often have very deep answers in mathematics.
@@mCoding Am I missing something? I'm thinking, naively, that the reason we can assume we're using the rightmost number is that if we weren't, a smaller (or at least not-larger) previously generated chain would have already calculated that and cached it.
@@jakemayer2113 The video mentioned that you can't assume this for powers larger than ~12 thousand.
Not using the largest power for a given multiplication doesn't mean you don't ever use it. You might need a very large number, and it might happen to be easier to sum 2 very different large numbers (abandoning the largest for a few calculations to get a different number), than to continuously work off the largest.
@@mCoding I'd love to listen to you explaining something for 3 hours, your voice is soothing.
@@jakemayer2113 It’s not always true that you end up using the largest number. @mCoding already gave an example. Take 18: we already know (1 2 3 6 12 15). For the next step, we can build 18 as 6+12 or 3+15, so you have two equally efficient answers. As you can see, it's possible to be efficient without taking the largest number 15. You might imagine a scenario where taking the largest number is slightly less efficient than taking smaller numbers. It's rare though, and the first time it happens is at 12509. I suggest you look up Addition Chains on Wikipedia, which summarizes the paper @mCoding referenced.
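For readers who want to experiment with this, here is a minimal brute-force sketch (my own illustration, not the video's implementation) that finds a shortest addition chain by iterative deepening; it is only practical for small n:

def shortest_chain(n):
    # Iterative deepening: try chain lengths 0, 1, 2, ... until one reaches n.
    # Restricting to strictly increasing chains is safe: any addition chain
    # can be reordered and deduplicated into an increasing one of equal length.
    def extend(chain, steps_left):
        if chain[-1] == n:
            return chain
        if steps_left == 0:
            return None
        # Candidate next elements: sums of ANY two chain members,
        # not just sums involving the largest one.
        candidates = sorted({a + b for a in chain for b in chain
                             if chain[-1] < a + b <= n}, reverse=True)
        for c in candidates:
            found = extend(chain + [c], steps_left - 1)
            if found:
                return found
        return None

    steps = 0
    while True:
        found = extend([1], steps)
        if found:
            return found
        steps += 1

print(shortest_chain(15))  # e.g. [1, 2, 4, 5, 10, 15] -- 5 multiplications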
8:52 I just finished my Discrete Math Homework on Modular Arithmetics, my last question was 600 ^ 251 mod 1457, so jokes on you
Happy day!
Modular arithmetic was my favorite math subject! It's amazing how freaky number theory can get compared to other theories, when people think it's just multiplying large numbers together.
Well, joke's on you, that 600 isn't an integer with normal multiplication but rather the name for an element of that finite ring
@@stanleydodds9 we all know that. But you can still call 600 an integer even in this context, informally.
Cryptography uses modular exponentiation for this. It's less efficient than exponentiation by squaring for smaller numbers, but (in comparison) stupidly fast when the base isn't 600 but more like 600-6000 digits.
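As a side note, Python's built-in three-argument pow does modular exponentiation directly, reducing mod m after every multiplication so the intermediate numbers stay small:

print(pow(600, 251, 1457))  # same result as (600 ** 251) % 1457, but far faster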
"slap the sub button an odd number of times" got me good 😂
It doesn't work if you are already subscribed
Seeing a new mCoding video has come out is always a delight. Thank you for what you do!
Glad you enjoy it!
Loved the conclusion on how this trick applies to matrix exponentiation. Simulated Markov chains or random walks must benefit so much from this.
Indeed! Thank you for noticing! I never even said the word Markov :)
This channel WILL blow up, your videos are top notch
Yeah we're in on the ground floor of this one
Dynamic Programming on steroids
Thanks for the "compiler explorer" tip! Now I finally understand Duff's device!
You bet!
@@mCoding Excited to see you using the site to teach folks :)
@@MattGodbolt I'm star struck! It's Matt Godbolt himself!
@@mCoding oh don't be silly :) I'm delighted to watch quality content and then go "ooh, I recognise that site!!" :)
This channel is a gold mine for me!!!
I got this in my recommended out of nowhere and I gotta say the algorithm is doing its work! I love this thank you! Subscribed.
I consider myself an expert Python developer. Still, I always learn something new about the beautiful language from your videos.
Glad to hear that!
beautiful is debatable but Python has its charms
I thought this video was going to end with binary numbers and repeated squaring... but that's where it started 😂. Great video!
Right. But now, calculate e**π. Or worse, 1.17e**(3.7 - 0.014πi). 😀 Unless a and b are both integers greater than 1, a**b is no-longer quite so simple. (What if they're: Irrational? Complex? Matrices? Tensors? Quaternions? Octonions? The horror. The horror.)
Thanks RUclips algorithm for recommending this, I am hooked. Subbed and looking forward to going through all your other videos
I ran into this problem when I did my implementation of the Newton fractal that 3Blue1Brown made two videos on. My naive approach made it so that it took like 20 seconds to compute ONE FRAME of an animation I was doing, with not-that-great error bounds. I quickly found out that the solution was more along the lines of what you finished this video on: you kinda have to take a different approach to exponentiation, if one is available. For computing polynomials (even of non-integer powers) there are some already-done methods that speed this up significantly. So really this complex problem generally has solutions in many different ways. Also, you don't go into complex numbers; for those fractals I had to take powers of complex numbers, which in its own right is also a big problem to be optimized.
When I need efficient exponents I go with a recursive strategy: if exponent is even return x^floor(exponent/2) * x^floor(exponent/2) , if odd multiply that times x.
Yeah, this is the strategy taken by most libraries (for good reason). The "optimal" way can sometime require much more memory than the way you suggested.
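A minimal sketch of that recursive strategy, assuming a non-negative integer exponent:

def fast_pow(x, n):
    # square-and-multiply: O(log n) multiplications
    if n == 0:
        return 1
    half = fast_pow(x, n // 2)
    result = half * half      # x ** (2 * (n // 2))
    if n % 2 == 1:
        result *= x           # extra factor for odd exponents
    return result

print(fast_pow(2, 15))  # 32768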
Thumbnails getting better tho
There are of course ways to make a general exponentiation algorithm even more efficient by checking if the number to be exponentiated is a power of two. Then you just use bitwise shift left instead of multiplication.
For example 2
@Gary Williams Let's say we want to calculate 16^5, you can just interpret that as 16*16*16*16*16. Multiplying by 16 is the same as four bitwise shift left.
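A sketch of the shift idea for a power-of-two base (writing the base as b = 2**k is an assumption of this illustration):

def pow_of_two_base(k, n):
    # (2**k) ** n == 2 ** (k * n) == 1 << (k * n): no multiplications at all
    return 1 << (k * n)

print(pow_of_two_base(4, 5) == 16 ** 5)  # True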
this channel will definitely grow a lot
Thank you so much 😀
9:00 "Imagine for a moment that x is not an integer but a one million by one million matrix"
Me, knowing about the Jordan normal form: "Okay, so almost exactly like the integer exponent then"
Jokes aside, this was a very educational video! Thank you!
This is a difficult topic, glad you enjoyed it!
You look like a ghost and you always seem vaguely pissed off.
Subbed.
I am glad my pale complexion pleases you. I have zero experience in setting up lights to make me look more lively :)
nice!
a good example that shows that most efficient code doesn't mean the most elegant code :)
It rarely is, see e.g. most C++ code.
I'm bad at using other people's stuff; I either write everything but (or sometimes even including) the standard stuff myself, or I abandon the project. As such, I've written `pow` functions a few times, and this optimisation is just one of those little things that I found myself, and am kinda proud of. This is to naive exponentiation as the way people normally multiply by hand is to repeated addition.
x^15 = x^(16-1) = x^16 × x^(-1), so 4 squarings and a division (or a multiplication by the inverse modulo, if doing modular exponentiation). Finding optimal addition chains for a set of targets has been proven NP-hard (for a single exponent the exact complexity is still open), so it's not trivial with large exponents.
I believe division isn't as fast as multiplication, so I'd be careful with that.
Also, if [number]**15 is close to overflowing but [number]**16 already overflows, you can get vastly incorrect results if you first calculate [number]**16 and then divide (although I believe Python integers can't overflow, they just keep adding bits).
@@hetsmiecht1029 Complexity-wise, multiplication and division are the same: you can always divide with only a constant-factor increase over multiplication, e.g. about 3 multiplications via Montgomery reduction. The 16 here is in the exponent, so there are no overflow concerns. Of course, a big-integer library should be used in languages like C if the numbers go beyond the basic integer sizes based on the machine word size.
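A sketch of the x^16 × x^(-1) idea under a modulus (Python 3.8+'s pow(x, -1, m) computes the modular inverse, which requires gcd(x, m) == 1):

def pow_15_mod(x, m):
    # 4 squarings to get x**16 mod m, then multiply by the inverse of x
    x16 = x
    for _ in range(4):
        x16 = x16 * x16 % m
    return x16 * pow(x, -1, m) % m

print(pow_15_mod(600, 1457) == pow(600, 15, 1457))  # True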
The exponent list isn't necessarily unique. In the case of 15, there are 4 minimal chains;
{1, 2, 3, 5, 10, 15}
{1, 2, 3, 6, 9, 15}
{1, 2, 3, 6, 12, 15}
{1, 2, 4, 5, 10, 15}
While working this out, I came up with a much simpler recursive solution. Before I state it, consider the following:
Note that 15 = 5 + 10 and there are two minimal chains there that decompose 15 that way. There are 4 minimal chains for 10;
{1, 2, 3, 5, 10}, {1, 2, 4, 5, 10}, {1, 2, 4, 6, 10}, {1, 2, 4, 8, 10}
and two for 5
{1, 2, 3, 5}, {1, 2, 4, 5}
By taking the outer product of these two lists of chains using set union as multiplication we have (Note: This code is in Mathematica);
MinimalBy[Flatten[Outer[Union,
{{1, 2, 3, 5, 10}, {1, 2, 4, 5, 10}, {1, 2, 4, 6, 10}, {1, 2, 4, 8, 10}},
{{1, 2, 3, 5}, {1, 2, 4, 5}},
1], 1], Length]
which returns {{1, 2, 3, 5, 10}, {1, 2, 4, 5, 10}}, the two entries we had before merely missing our 15. Note that Outer is doing something similar to a list comprehension; use that if you want to port this code to other languages. Using this intuition we can define the following function which iterates over all decompositions of our goal, finding the minimal length chains
MinChains[1] := {{1}}
MinChains[x_] :=
Set[MinChains[x],
Append[#, x] & /@
MinimalBy[Flatten[Table[
Flatten[Outer[Union, MinChains[k], MinChains[x - k], 1], 1],
{k, 1, Floor[x/2]}], 1], Length]]
that "Set" tells the program to store the return value and use it whenever that same argument is called so it doesn't need to be recalculated. The "Append" line adds our argument to the end of all our chains. "Table" enumerates the outer product of the MinChains for every number decomposition for x between 1 and Floor[x/2].
Using this, I was able to calculate in about two seconds that the minimal length chain for 100 is length 9 (example: {1, 2, 4, 8, 16, 32, 64, 96, 100}) and there are 62 such chains. Of course, that outer product is quite costly; I'm sure there are many optimizations that could be made to this, but I think it's a nice starting point.
I'm not sure that I understand correctly what the outer product is, but I'm not convinced that the algorithm is correct. I think that there could be a number n=a+b, that has a minimal chain that uses x^a and x^b, but that isn't based on the minimal chains for a and b.
@@An-ht8so Indeed, there are some missing cases. An explicit example of what you're talking about is the minimal chain for 13;
{1, 2, 3, 5, 8, 13}
which is missed by my algorithm. In this case, the chain for 5 is {1, 2, 3, 5} and the chain for 8 is {1, 2, 3, 5, 8}. In this case, the chain for 8 isn't minimal. I wrote the following much less efficient algorithm to cover these cases;
subsetFil[{}] := {}
subsetFil[{x_, y___}] :=
Flatten[{{x}, Select[{y}, ! SubsetQ[#, x] &]}, 1]
PreMinChains[1] := {{1}}
PreMinChains[x_] :=
Set[PreMinChains[x],
subsetFil@
SortBy[Append[#, x] & /@
Flatten[Table[
Flatten[Outer[Union, PreMinChains[k], PreMinChains[x - k], 1],
1], {k, 1, Floor[x/2]}], 1], Length]]
MinChains2[x_] := MinimalBy[PreMinChains[x], Length]
Unfortunately, it's too memory-inefficient to go beyond about 30. None of my tests found anything actually smaller than what my original implementation was able to find, but the number of minimal chains increased by between 0% and 35%, with most not changing at all.
@@XetXetable thank you for the follow up. I don't have a good intuition about whether or not there is optimal values missed for greater n.
This is your best video yet
This is the kind of stuff that makes people think that programming requires math and that programming is really hard. As a fulltime web dev myself this stuff baffles me and even after watching it I don't get it :D
shhhh, He's staring into your soul
I wrote a method for power in ln(n) complexity, and a hyperpower one (I don't remember the exact complexity for hpow). The method was direct and I did not use memory (an array) to remember the intermediate results. Yes, this method was invented for a university project; I did not find any similar results at that time.
As you have proposed an algorithm for multiplying matrices, I see a potential problem here with unnecessary memory usage and longer execution time. You can calculate the required values directly (almost in place).
I think you are mixing up the memory used to compute the addition chains with the memory used at runtime to compute the exponentiation.
@@mCoding You don't need an efficient chain to calculate pow(x, n), and you also don't need precalculated x2, x4, x8, x16...
For one million x one million matrix you'd maybe use the 2x2 block split that keeps the values in cache. And transpose one of them. I forget and didn't understand the rest, but there was a neat video about it.
Since I don't understand anything after you start defining the first class (I understand what a class is, I just don't understand this language), here is my solution for high exponents (approximate though):
Let's say you search n**p
Step 1: find log_2(n)
Step 2: r = p * log_2(n)
Step 3: multiply r by a constant so it becomes p * ln(n)
Step 4: e**r (can be estimated with sum_(i=1)((r**i)/i!) )
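The same recipe in Python (a floating-point approximation, not exact integer arithmetic):

import math

def approx_pow(n, p):
    # steps 2-4 above: n ** p == exp(p * ln(n)), up to floating-point error
    return math.exp(p * math.log(n))

print(approx_pow(2, 15))  # ~32768.0, with rounding error for large results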
One fun application of this is calculating fibonacci numbers for large n! If you formulate the state as a vector of the two previous numbers then the nth fibonacci number can be found with [1, 1; 1, 0]^n * [1; 0]. Pretty nifty
There's already a closed-form expression for that that's even faster, and it involves the golden ratio!
Find the eigenvalues and eigenvectors of that matrix you refer to and diagonalize it in the form A = CDC^(-1). Now A^n = CDC^(-1)CDC...DC^(-1) = CD^(n)C^(-1). Because D is diagonal you can just take the nth power of each component, multiply this thing with the vector (1,0) and you will actually have a closed form formula for the nth fibonacci number, pretty crazy!
@@dekippiesip where's the golden ratio in that algorithm? I am aware of a closed form solution which raises the golden ratio to the nth power or something I'm pretty sure but that doesn't involve matrices.
I just wanted to highlight a fun usage of the exponent chaining shown here
@@bathtubed The eigenvalues of the matrix are some linear combination of the golden ratio (I forgot the exact form). Just compute it and you will see what I mean, it's fun.
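A self-contained sketch of the matrix approach from this thread, using square-and-multiply on 2x2 tuples:

def fib(n):
    # n-th Fibonacci number via fast exponentiation of [[1, 1], [1, 0]]
    def mul(a, b):  # 2x2 matrix product
        return ((a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]),
                (a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]))
    result = ((1, 0), (0, 1))   # identity matrix
    base = ((1, 1), (1, 0))
    while n:                    # square-and-multiply on matrices
        if n & 1:
            result = mul(result, base)
        base = mul(base, base)
        n >>= 1
    return result[0][1]

print(fib(10))  # 55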
i like the last example about powering a matrix.
except it raises another question: should I be diagonalizing that matrix first, and if so, by which method…
It depends on the application, similar to "should i sort my data before searching it?" If you only want to compute a few powers of the matrix, just computing the powers is faster. If you want to compute many many powers, diagonalizing first will be faster.
That's a neat solution. Is it usually better to encapsulate stuff into classes like you have? I usually have a bunch of functions when solving Project Euler's questions
I used a class here just to avoid having global variables everywhere. All the cached stuff and dictionaries would need to be stored somewhere, so it made sense to use a class here. Most problems I would say don't need a class though.
@@mCoding Because classes and functions are first-class citizens in Python, I consider them being accessible from the global scope the same as them being global variables themselves.
(the global variable "EfficientExponentiation" stores the )
So what difference does it really make if it's just a static holder like any global collection?
Get the pair of divisors of 15 with the minimum sum then subtract 1 from each:
15 = { 1 * 15 | 3 * 5 }
sum = min { 16 | 8} = 8
Then do the same for 3 and 5. But these are primes, so they are final. Then you get 2 + 4 = 6 multiplications.
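A simplified sketch of this heuristic using the prime factorization (for 15, the minimum-sum divisor pair 3 × 5 gives the same decomposition); it beats the naive count but is not optimal in general, and cannot help at all for prime exponents:

def heuristic_mults(n):
    # naive multiply counts per prime factor: x**n == (x**p)**(n // p),
    # costing (p - 1) naive multiplications for each prime factor p
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            count += d - 1
            n //= d
        d += 1
    if n > 1:
        count += n - 1
    return count

print(heuristic_mults(15))  # 2 + 4 = 6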
Hey, love your work in LCD Soundsystem
Hey, thanks! ....
lol I had the same thought.
You're gonna blow up. Can't believe u only have 25k subs..
Maybe one day!
@@mCoding you have a very attractive and digestible (and ADDICTING) format to your videos. I also learn a lot. My CS degree aint got shit on the videos you release lol.
Keep it up man. Thanks for the hard work!!
I've been professionally programming in Python for 8 years, but I was today years old when I learned that you can do list[-n] to sample the rear of a list.
You can also do [::-1] to get a reversed list. (Works on plain lists, no numpy needed.)
I wonder how to extend this to allow divisions as well. E. g. x **15 can be computed as x**16/x in 4 multiplications + 1 division.
Interesting idea. It’s unlikely to lead to better performance, though - division is slow.
I'm interested in the semantics of using division, as it is merely multiplying by the reciprocal. It seems there may be more efficient steps then, but I have no clue whether that changes the computational complexity.
You're definitely on to something! If you are allowed to use divisions as well, you can sometimes do even fewer total operations. However, divisions are often much more expensive than multiplications, so the problem using only multiplications is the main problem of interest.
I wrote this function years ago. I just rewrote it from memory to post it here.
Note, for some reason, RUclips removes the first and last underscore character in '__name__' below.
There should be two underscores at the start and end of 'name'.
#!/usr/bin/env python
def pow(base, exponent):
    """ Raise the base to the integer exponent 'exponent'.
        The value of 'base' must be greater than zero and
        the value of 'exponent' must be an integer.
    """
    value = 1
    pwr = base
    # Loop over each bit of the exponent.
    while exponent > 0:
        if (exponent & 1) != 0:
            value *= pwr
        pwr = pwr * pwr
        exponent >>= 1
    return value

if __name__ == "__main__":
    print(pow(2, 15))
    print(pow(3, 9))
It's because RUclips uses a markup language called Textile in which putting an underscore after a whitespace character, and another underscore before a whitespace character, indicates that everything between the underscores should be italicized. Similarly, asterisks make text bold, and dashes cross it out:
'_some text_' => _some text_
'*some text*' => *some text*
'-some text-' => -some text-
@@Ashebrethafe - Thank you.
@@Ashebrethafe Does it escape with '\'? Let's see:
"if \__name__ == '__main__'" => if \__name__ == "__main__"
nope
I think in many cases in actual code where such an "x ^ n" problem comes up, n is actually a known constant (though I guess in those common cases you probably wouldn't have such a big n most of the time), so couldn't the compiler just precompute the minimal multiplication chain for those n's that occur in the code and then use that? Only up to a certain limit for n or time limit of course. For the cases with unknown n it would still use the other algorithm.
Yes, the code he has written is basically for a compiler. You don't do this at runtime: just doing the inefficient number of multiplications is faster for small exponents, and the binary-representation trick is faster for big exponents.
# is that really faster and/or memory-effective than this:
def pow(digit, power):
    n = 1
    p = digit
    while n < power:
        digit *= p
        n += 1
    return digit

pow(2, 15)
After your video I checked this in Python, creating a function that optimized the multiplication. The idea was to create a generator function once and use it for many calculations to save time. However, I quickly found out that Python's power operator performs better than my own generated function, so it was a huge disappointment for me. But it was fun writing the code!
Dear James.
I really appreciate your approach of asking us to slap the Like button an odd number of times, but this algorithm has one major issue: if the like is already present, doing so will remove it. Please change the algorithm to take that behaviour into account.
Hi, could you provide a traceback that led to this issue? I'll create a ticket right away and we'll get this fixed by the next release.
There's some seriously interesting number theory research to be done here. I'm sure there are some mathematical dark arts yet undiscovered that will unlock this problem.
It's an interesting problem indeed. Is there an OEIS sequence for the minimum number of multiplications required for n?
Probably, but if there's not one you could submit it for consideration.
Found it, it's A003313, "Length of shortest addition chain for n"
@@tissuepaper9962 Thanks!
NIIICE! Kudos for attacking a difficult problem with Python!
Thanks!
I have never ventured into the code optimization part of programming, but this was interesting.
What about using logs?
For example, log(10^15) would be just 15*log(10). You only have one multiplication there. Though you would need a good way to calculate logs.... Otherwise it would be the same as the first example.
and then you still need to use exp(15*log(10)) to get the final result, which again is an exponentiation.
@John House no, but if you diagonalise the matrix first, you can convert it into a form where you can just log-multiply-exp each element along the diagonal, then undo the diagonalisation to get the final matrix.
How would you compute the exponent of a non-integer value then? And are there similar tricks or optimisations you can do with this?
Non integer exponents go in a completely different way, typically using a Taylor expansion approximation for exp(x).
One way to do it is to split the non-integer value into its integer and fractional parts and compute the fractional part using a Taylor series expansion. Then you can multiply both results (e.g. x^12.345 = x^(12 + 0.345) = x^12 * x^0.345).
@@mCoding To add to this, before moving to Taylor series there's a more intuitive interpretation that works for rational numbers.
First take a step back: how do we even define something like 10*(1/2)? Obviously we can't add 10 to itself 1/2 times; that doesn't make sense. We know that for integers a, b and c, a*(b+c) = a*b + a*c, and we want the same rule to hold for fractional multiplication. So we see that 10 = 10*1 = 10*(1/2 + 1/2) = 10*(1/2) + 10*(1/2). Therefore 10*(1/2) is the unique number x such that x*2 = x + x = 10, which means it makes sense to assign the value 10/2 = 5 to that operation.
The same logic holds for exponentiation: 9^(1/2) is the unique number x such that x^2 = x*x = 9, or in other words, x = 3. So x^(1/2) is just the square root of x; similarly, x^(1/n) is just the nth root of x. Generalizing the basic rules of exponentiation then implies that x^(m/n) = (x^(1/n))^m, or the mth power of the nth root of x. Now any rational exponent makes sense!
And since the rational numbers are a dense subset of the reals, there is only one way to extend the function f(x) = a^x to the reals such that f is continuous. Now we can make sense of a^x even for irrational x! To evaluate a^x for irrational x, just consider a sequence of rational numbers x_n converging to x and take the limit of a^x_n as n goes to infinity. The number that limit converges to is defined as a^x for irrational x, and this is the only assignment that makes said function continuous.
And there you have it: now you have a continuous (even infinitely differentiable) function of which you can derive a Taylor series, and prove that it is exactly the one you would expect, and that the radius of convergence of the series is infinite!
Invoking Taylor series before establishing this is circular reasoning, btw: in order to derive the Taylor series expansion we need to know what is meant by non-integer exponents in the first place!
For matrices similar logic holds, though it could happen that the matrix equation A*A = B for a given B has more than one solution, making B^(1/2) not well defined. B must have certain properties before this even makes sense (I'm guessing being invertible, but not sure if that's enough).
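A sketch of the integer/fractional split mentioned above (floats, so only approximate):

import math

def pow_real(x, e):
    # split e into integer and fractional parts: x**e == x**n * x**frac
    n = math.floor(e)
    frac = e - n
    return x ** n * math.exp(frac * math.log(x))  # x**frac via exp/log

print(pow_real(10.0, 12.345))  # ~2.213e12, close to 10 ** 12.345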
Awesome! Subbed. Just one question though... If a good compiler already takes care of this itself, why do we have to do it?
I don't want to speak for James, but I believe he mentions that this was a ProjectEuler question so it's mostly just for practice and deepening your understanding of how computers actually derive complex mathematical equations. if you aren't writing a compiler for a language this probably isn't something that will come up in your career.
Besides the fact that deepening your understanding is always a good thing, compilers do this for integers, but none will do it for matrices or many other objects that you exponentiate. This may very well be an optimization you do to speed up one of your own simulations.
@@mCoding Thank you for the answer!
I love this channel
Tapped on the video
Thought about what it could be
What it is. None of them are the same anymore.
Could prime number partitioning be helpful to calculate this? Isn't that the first obvious solution to move ahead with?
I want to say this method works exactly the same whenever * is an associative operation, regardless of what * means. But calling its repetition exponentiation in all of those cases seems quite OK.
Indeed, this works for any associative operation!
Well it's a bit odd to call it exponentiation when * is addition!
Very nice video!
In an interpreted language, how much sense do these kinds of optimizations make versus calling a built-in function? I have tried to optimize stuff like this in Octave, and sometimes I found that what appears to be a rational choice ends up being slower because defining new variables takes more time due to memory allocation.
In interpreted languages, you will almost never be able to beat a builtin function in any situation, and I nearly always recommend using a builtin over writing it yourself. The fact that my pow_15_minimal beats x ** 15 is totally unexpected.
@@mCoding does python have something like pow(X,15)? Maybe ** is working at the interpreted level?
Compiler writers are geniuses.
You can't get any faster than log2(n). And the binary algorithm already takes at most 2*log2(n), which means you can just wait at most 2x longer rather than calculate the most efficient exponentiation. That's important, because your algorithm has e^n complexity, which means for bigger numbers it's faster to just calculate with the binary algorithm.
I think that you are mixing up the runtime of the exponentiation (which is log2(n) for the optimal chain as well as for the binary method) with the runtime it takes to compute the order in which to do the multiplications (which is > exp(n)). You are not meant to calculate the numbers for efficient exponentiation at the runtime of a program that needs fast exponentiation, this is an exponential time algorithm as you mentioned. You only need to calculate the optimal chain for each power ONCE for all time, it never changes, regardless of the input numbers or even datatypes. If we were writing a program where we knew that raising things to the 15th power was extremely common, we could calculate that (1 2 3 6 12 15) is optimal for 15 once, before runtime, and then implement the pow_15_minimal function and replace it for any places we had ** 15. When we run our program we will then see the performance gain of pow_15_minimal over ** 15, but we will not see the cost of computing (1 2 3 6 12 15) as this was precomputed. In general, you can download a table of these optimal chains so you don't ever have to compute them yourself, they are just available for you to use if you wish. In Python this is not something you would probably end up doing unless you were using big matrices.
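For reference, here is a sketch of what pow_15_minimal might look like, reconstructed from the chain (1 2 3 6 12 15) discussed above (the exact function is in the linked source code):

def pow_15_minimal(x):
    # chain (1 2 3 6 12 15): 5 multiplications total
    x2 = x * x
    x3 = x2 * x
    x6 = x3 * x3
    x12 = x6 * x6
    return x12 * x3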
Does there end up being a cutoff time where if power < cutoff, then the binary power algorithm is faster? The general approach seems to use a quite a lot of object storage, which for math computations is usually not the best.
Yeah, storage is most likely the reason the compilers use 6 multiplications rather than 5 for pow_15.
great video! many thanks
I get an error using the code you have here typed in manually when I try to execute a cell containing the class EfficientExponentiation (and nothing else). I get the same error if I use the code from your github in a cell. I'm using Python 3.7.10 in Jupyter notebook. The output from your code is
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in <module>
1 from typing import Optional
2
----> 3 class EfficientExponentiation:
4
5 def __init__(self, step_limit: Optional[int] = None):
in EfficientExponentiation()
30 self._compute_next_step()
31
---> 32 def minimal_multiplication_chain(self, n: int) -> tuple[int, ...]:
33 self._step_until_n_computed(n)
34 return self._min_mult_chain[n]
TypeError: 'type' object is not subscriptable
---------------------------------------------------------------------------
If I change
def minimal_multiplication_chain(self, n: int) -> tuple[int, ...]:
to
def minimal_multiplication_chain(self, n: int) -> None:
it runs and returns the same value as if I had typed
x = EfficientExponentiation(step_limit=50)
x._step_until_n_computed(15)
x._min_mult_chain
It acts like the type hint module is broken. Note that tuple works fine on the earlier statements where tuple is not the output type, but an input type.
There is a long rant here that seems to be the same issue: stackoverflow.com/questions/62871524/is-pep-585-unusable-at-runtime-under-python-3-7-and-3-8
By adding at the top
from typing import *
and then changing
def minimal_multiplication_chain(self, n: int) -> tuple[int, ...]
to
def minimal_multiplication_chain(self, n: int) -> Tuple[int, ...]
that is, capitalizing Tuple, it works right.
It's certainly not your fault that the Python typing format you used is broken for 3.7 and 3.8, but this was annoying to find and fix, and probably discouraging for noobs trying to run your code. The form of typing you are using originates in PEP 585, which only mentions Python 3.9. I don't know whether it's available via __future__ or not, nor how to drag it into 3.7/3.8 even if it is.
Oh, PS: I don't want to discourage _you_. You're doing a great job teaching in general - good examples, clean code, etc. Keep up the good work. I learned a lot about typing tracking this down; unfortunately, I'm stuck at 3.7 until Anaconda catches up.
loving these videos keep em coming :D
Glad you like them!
I didn't completely follow the process, but basically they have a logarithm table in the computer but for exponentiation?
You said the best time complexity was exponential; this is false unless n^(1/2) counts as exponential, which I think it does not. I know of a log(N) solution, but it's not the exact least number of multiplications, and it is very easy to program. I think I can get the answer in the exact least steps in sqrt(n) + sqrt(n/2), but it's a little complicated and I'm too lazy to program and verify it.
I think the claim is that current algorithms for finding the minimum length addition chain for a number k takes exponential time. See the Wikipedia article on Addition Chain, in particular, the "Methods for computing addition chains" section.
Amazing. Can you make a video on how to time programs? Some people say use perf_counter or default_timer and it's all very confusing to me
If you want to know how to use time, here is a simple video - ruclips.net/video/By3bfIpTZuo/видео.html
It's cool that for numbers that are in the Fibonacci sequence, the chain is equal to the Fibonacci sequence (see 8:20 for number 13).
Is this a general pattern? Very interesting!
@@mCoding This doesn't seem to hold all the time (e.g. for 8, the best chain is (1,2,4,8) whereas using the fib sequence would yield a chain that's one longer: (1,2,3,5,8)). I suspect that's true for many Fibonacci numbers though, since you are always adding the largest two numbers in the chain, and the only way to possibly get to your target number faster is to add the last number of the chain to itself (which is what happens in the case of 8).
A perhaps simple question: how do you get that nicely formatted exit code at the bottom/end of the module when you run it? I use VSCode for Mac
why not this?

def pow(x, y):
    acc = x
    for _ in range(y - 1):
        acc *= x  # y - 1 naive multiplications
    return acc

pow(2, 15)
Why not:

def pow_15(x):
    x2 = x * x
    x4 = x2 * x2
    x8 = x4 * x4
    x16 = x8 * x8
    return x16 / x

And then you could do a smart while loop that knows to divide when needed
Assuming you mean x16 // x of course (int division), this is certainly an option. It's still 5 total math operations, so on par with the minimal multiplication method. Typically division is slower than multiplication at a hardware level so this is almost never done in practice, but there may yet be a niche use case for it in the future!
Here are my timing tests:
timeit.timeit("pow_15_minimal(100000)", globals=globals())
Out[7]: 0.6788639999999901
timeit.timeit("pow_15_minimal(100000)", globals=globals())
Out[8]: 0.6870734000000027
timeit.timeit("pow_15(100000)", globals=globals())
Out[9]: 0.8700359000000049
timeit.timeit("pow_15(100000)", globals=globals())
Out[10]: 0.809845100000004
timeit.timeit("100000 ** 15", globals=globals())
Out[11]: 0.6612468999999948
timeit.timeit("100000 ** 15", globals=globals())
Out[12]: 0.7574889000000127
@@mCoding Yeah that makes sense.
Thanks!
What will the output be, @mCoding?

def foo(a):
    a.extend([4, 5, 6])

a = [1, 2, 3]
foo(a)
print(a)

and why is that

other example:

def foo(a):
    a['foo'] = "bar"
    return a

a = {}
foo(a)
print(a)

def foo(a):
    a = 5
    return a

a = 0
foo(a)
print(a)
The problem is more complicated because the processor can multiply independent things in parallel. Knowing that, is there a parallel multiplication scheme more effective than these chains?
These videos are blowing my mind🔥🔥🔥
I always wondered who are those people who have those weird hacky solutions in codewars. Now, I am looking at one of them.
If it works, it works!
If I was concerned about this level of optimisation in an interpreted scripting language (as python is) I'd use a compiled language instead. Scripting is great for managing a process and for human-speed interfaces but not for the compute intensive stages.
I think you are missing the point here. Python is only used to compute the optimal addition chains, which only have to be computed once and never again. Once they are computed, you can use them in whatever language you want. You could use them in your implementation of a C compiler to produce faster C code.
@@mCoding I watch your videos because you actually do what I was trained to do, which is analyse and optimise every line of code of every algorithm. I missed the point that this was a one-off precomputing of a lookup table.
But consider that the table will be stored in memory and it then becomes a question of whether the speed of memory access, even for compiled languages, can compete with the compute speed of a dedicated math unit on chip.
Dedicated, tuned and compiled math libraries are essential for all languages, especially scripting languages like python.
My reaction to your video came from the perspective of seeing, for example, professional java or python or PL/SQL developers implement every aspect of a process within the language they specialise in, often at great cost to the process and the enterprise when they should be selecting the tools that are appropriate. Their reasoning will often be they are just prototyping but the temporary always becomes the permanent as no-one wants to redo work from scratch. They want to move on to more interesting work.
I've worked in environments riddled with processes built this way and have seen them, not just clog servers but clog entire IT departments with constant resource issues with efforts to find workarounds. Because fixing the root cause at that stage means major reworking of projects that are now in production.
6:40 "if a key already exists it does nothing"... not actually true, it returns the value for that key.
Next time im in an interview im turning the tables on em
Woah there, be nice to your interviewers!
I can do it in one multiplication:
1) Take the log of your number
2) Multiply that log times the number you are exponentiating by
3) Take the inverse log of the final number
Example:
X^15 = inverse log(15 * log(X))
But how many multiplications does computing the log of your number take?
@@mCoding it would require a lookup table, or possibly some conversion to a log(2) using binary expansion. I haven’t gone that far but I’ll admit the gains could be lost in the implementation.
I also have a hunch that exponentiation libraries might use something like this trick under the hood… but this method is how humans would do this before computers, using lookups into log tables
It's significantly simpler to use the binary form with repeated squaring and division by 2, also, I don't see a case where it would be slower.
See the other comment about this, division is significantly slower than multiplication on most hardware, and speed testing confirms that pow_15_minimal is faster than the builtin ** 15, which is faster than repeated squaring then division.
@@mCoding well, since it's a divide by 2, you can just shift right.
@@gideonmaxmerling204 nuh-uh, you're dividing by the base
@@dielaughing73 what do you mean?
dividing by two and shifting right are equivalent.
Could you get away with 4 multiplications for the X**15 example?
X*X=X**2,
X**2 * X = X**3,
X**3 *X**2= X**5,
X**3*X**5 = X**15
It makes me happy to see that viewers are trying these things for themselves! Nice try but small mistake! X**3 * x**5 == x**8, not x**15!
@@mCoding I forgot how this works, cheers
I was hoping for some "evil bit hacks" 😂
You should look into Lucas numbers and partitioning
Just wondering, is it wrong to do x15 by getting x5 and multiplying it by x3? Getting to x15 in only 4 multiplications
Multiplying 2 powers of x will add their exponents.
x5 * x3 = x8, because it's basically x*x*x*x*x * x*x*x.
Thanks CatGamer for explaining!
Another solution:
x2 = x * x
x3 = x2 * x
x5 = x3 * x2
x10 = x5 * x5
x15 = x10 * x5
5 multiplications... The problem is that Python is not assembly, and this function you made will allocate a lot of memory, which takes time. Also, it's only the power for 15; you'd need a lot of memory just for a power function with an arbitrary exponent. This approach is efficient only when you do it in assembly and only need the power of 15.
It will allocate more memory, but I definitely would not call it "a lot" of memory. If you are worried about all the temporaries, that was just for ease of the viewer, you can compute it using only a single temporary integer as follows:
def pow_15_minimal_fewest_temp_vars(x):
    # (1 2 3 6 12 15)
    y = x * x  # x2
    x *= y     # x3
    y = x * x  # x6
    y *= y     # x12
    x *= y     # x15
    return x
Now the extra memory is at most the memory of a single python int. (Note the binary assembly method also requires a temp int).
Copied the code directly and ran it:
Traceback (most recent call last):
File "exponentiation_efficient.py", line 9, in
class EfficientExponentiation:
File "exponentiation_efficient.py", line 38, in EfficientExponentiation
def minimal_multiplication_chain(self, n: int) -> tuple[int, ...]:
TypeError: 'type' object is not subscriptable
Using Python 3.6.9
The code requires 3.9 because I used subscripting of builtin types.
@@mCoding Thanks! Works now.
Hi, I have a doubt: what does [x, ...] mean? (min 4:29)
The notation tuple[int, ...] is the way to type hint that this variable will be a tuple of variable length, but that it will contain only integers. For tuples of fixed length, say 3, I would just write tuple[int, int, int], but in this problem we consider tuples of longer and longer length, so this is why I use the variable length notation. In many situations, when the length is not known you could just use list[int] instead, but here we are storing the elements in a set _cached_chains, so tuples must be used instead of lists because lists are not hashable. Putting is all together, dict[int, tuple[int, ...]] is a dictionary whose keys are ints and whose values are variable length tuples of ints.
@@mCoding ooh thanks :)
4:52 w-what are you doing, step-variable?
Great video
Tony Hawks Pro Python 3
I can hear the music in this comment.
Very interesting, thanks for sharing
My pleasure!
You know you're a programmer if your eyes hurt at 2:50
This seems like an instance of Dynamic Programming
Generally this is disputed by noting that an addition chain can be composed of chains for smaller numbers that are not optimal. The way to correct this would be to have subproblems consisting of sets of numbers that must all be computed. The problem is that the state to memoize then explodes.
Can you create content about all design patterns?
Design & program structure are difficult topics to teach, it may take a while.
@@mCoding it’s ok, I’ll be happy to wait your content
@@izzanzahrial7345 d&p can't be taught in a couple of videos. The book "design patterns: events of object oriented reusable code" (more or less) is a good starting point.
I have a sneaky suspicion that compiler writers have an additional problem to deal with here, it's not just efficient exponentiation, it's precision. I'd bet the mess that is floating point math rears its ugly head, you likely have to sanity check that the approach you're taking gets the same result (likely only possible with constants, and likely compared to the for/while loop approach).
Indeed! Integer multiplication is associative so there is no issue, but floating-point multiplication is not, so compilers will not do this optimization unless you pass a specific flag to allow it for floats. This optimization will work with any multiplication as long as it is associative.
make a data structures and algorithms series pls
Just like dynamic programming
Wow, a new useful channel
Glad you think so!
Cool video, vote up :)