I tend to write list comprehensions within list comprehensions. Also, fun fact, there are only dictionary, set and list comprehensions in Python and they can all have comprehensions nested within each other. The one mystery that I want explained is the following example: `num1 = 1` followed by `num2 = num1`. What on Earth is the point of doing that? As far as I can tell, the only thing it does is make num1 and num2 point to the same object, making num2 redundant. I get annoyed whenever I come across this in someone else's code.
I'm surprised that the append method isn't just inlined to prevent the need for a method call. I assume that append does something similar to just the LIST_APPEND op code, so such a small method should be optimized. Java is similarly high level, with compiling to byte code and running on a virtual machine (the JVM), but it optimizes method calls vs inlining automatically during compilation, based on the method's complexity. In a couple of other languages that run on the JVM, you can even specify explicitly if you want to inline a method or not (with some added benefits surrounding generics)
`list.append` is implemented in C. I don't see how this can be inlined. Also, it's difficult to prove that `my_list` is actually of type `list`, even though we declare it as such above. That's partly because Python doesn't have a static type system contrary to Java and `my_list` could get modified through, for example, threads.
Java is not similarly high level, it gives you way more control. Most notably, Java is statically typed, Python isn't. The compiler can't know that the variable is actually a list, so it can't be inlined. If you want stuff like that, look at Cython.
@@megaing1322 "Java is not similarly high level, it gives you way more control." It is just as high-level, if not more. Having more control doesn't make it any less high-level. Being incredibly slow and without checks is NOT a trait of high-level languages.
@@ABaumstumpf higher level = further from the actual CPU. In Java, you have direct access to various low-level types like different int sizes, a choice between float and double, and arrays are exposed way more directly than in Python. Yes, Python is higher level than Java. That doesn't say anything about the quality of either. But both are of course still high-level languages; anything at the level of C or above is.
@@megaing1322 "In Java, you have direct access to various low-level types like different int sizes, a choice between float and double, and arrays are exposed way more directly than in Python. Yes, Python is higher level than Java." No, that shows that Python is a dynamically typed language, and that was a thing even 50 years ago in some low-level languages. The concept of high/low-level only makes sense when talking about which operations and idioms a language supports. Python does not support low-level programming, Java also not really, C++ does. Python does support high-level abstractions, so does Java, so does C++. And no, not "anything above C", because C is also high-level depending on the environment and what you are doing. Not as high-level as many modern languages of course, but you are no longer restricted to bare-to-the-metal code. And btw: Java has arbitrary precision numbers, and Python has floats (which usually are just C doubles) and complex (2 "floats"), and until Python 3 it also had 2 integral types. Having more options available does not "make" a language lower-level.
What is being optimized is more the case where comprehensions are being called a lot. An individual comprehension's iteration (which is more or less what is being tested here) won't change that much.
It's always funny how whenever someone explains why X is faster than Y in Python, it always comes down to the actual C implementation of X and Y. It's like Python is just a C wrapper (I use Rust, btw)
@@megaing1322 yup, this is technically correct. Even though I want to see anyone able to write Rust-like code in assembly lol (well I guess Python would be the same, as I guess the underlying C code is hard af)
Sometimes it isn't possible to change language for various reasons and Python may be forced on you and your team, but it doesn't mean that performance never matters at all.
I feel the same way. I primarily use C++. I've done it for 30+ years. I don't understand why the industry has coughed up a language that does for the most part the exact same thing as C++ (loops, arrays, variables, classes), just a lot lot slower... Why drive a Pinto when you can drive a Ferrari? Supposedly C++ is difficult... I don't get why people feel that way. I picked C up when I was 15ish by reading a book. I started picking up C++ a couple of years later. This was long before YouTube, Google, StackOverflow, Udemy, LeetCode. Just about every language boils down to a few basic concepts.
1) Sequential execution of instructions.
2) Loops or conditional branching (JMP, JLE, GOTO, IF, WHILE, FOR, DO, etc.; it's all variants of the same concept)
3) Storage in memory, for example variables (a, b, count, i, j, numDaysInMonth, numGoalsScored, x, y, z, etc.)
4) Storage in multiple chunks (bytes) of memory (malloc, alloc, new, free, delete, smart pointers, std::vector, lists, dictionaries, tuples, etc.)
Now OO languages have things like classes, inheritance, polymorphism, but that stuff isn't overly complicated. Python has classes. Most of the time when I write Python, I use classes, I'm just used to it. It's all the same crap. I've programmed in LOGO, GW-BASIC, ASM, C, C++, C#, Visual Basic, Perl, Python and probably a few I'm forgetting. I stayed away from Java thankfully. All of these languages boiled down to the same core concepts listed above, no matter what problem you were trying to solve. So... if you're going to spend a buttload of time writing a bunch of code, why not do it in a language that runs 30x faster?? Honestly, if you have a language that has 3-4 different ways to do a for loop and they all have different performance characteristics that warrant making YouTube videos about it, well, I think that's just a fundamental problem with the language itself and is something that really never should have ended up that way.
I think that it might have to do with array resizing. I do not know so much about Python, but in C/C++ it takes a lot longer to create a new array and refill it than to create an oversized one and just add to the corresponding index.
No, that doesn't happen here. It would in theory be possible, but as you can see from the bytecode sequence, none of those preallocate elements. `list.append` of course is smart enough to correctly scale preallocation to make append O(1), but that is not a difference between the two versions. However, this would happen for example if you call `list(range(n))` instead.
@@megaing1322 Yes, the cost here is the function call. Most compilers will inline the function call in contexts like this. I know in C++ most compilers would inline this function when used in a for loop, especially if the function is a template. It's a little strange the Python compiler chooses not to inline this. But, then again, usually these JIT-type scenarios do very few optimizations. And I imagine the dynamic typing of Python might force this to not be inlined
@@lucass8119 You clearly have no idea how python works. It isn't possible for the compiler itself to inline list.append. And CPython (which is what is being talked about) currently has no JIT compiler.
@@lucass8119 " the cost here is the function call." At least the data from this video would heavily imply - No, that is not correct and not the source of the differing performance: Here the 2 lines are almost parallel with a constant offset (the for-loop even slowly catching up). That can not be the result of an overhead that is incurred repeatedly
@@ABaumstumpf The overhead isn't incurred repeatedly, it's a constant time factor. Therefore, the two lines being parallel makes perfect sense. It's not like the second function call is more expensive than the first and so on. Each is equally expensive (theoretically), so the lines should be perfectly parallel, with the one with a function call being slightly slower. Both are O(n), they should be parallel. We know it has to be the function call, because look at the disassembly. That's the only difference.
Is it a toxic workplace if your boss tells everyone to make the code faster even if we sacrifice readability? Our layoffs have calmed down and we do print out documentation of our system, but I worry for the future employees and want to ask if I should step up and say that we should not sacrifice readability for speed.
My view here on readability: list comprehension doesn’t make code unreadable if people are used to coding in Python. It’s even sometimes simpler. In fact, I would say it depends on the complexity of the computation. I would say do not use list comprehension if there are side effects.
@@olivierbouchez9150 I'm talking about it in a more general sense. I really appreciate your perspective on it. Thanks and have an awesome day/afternoon/night. :)
@@aizenvermillion434 Code speed isn't the opposite of readability. Readability is an issue of either telling the truth about the use of your variables/functions, or not. The opposite of code speed is development speed. The more you optimize, the more difficult it will be to make small changes to the program, and vice versa. Development is about making the best product (code speed) or improving on that product (development speed). If your job requires you to do nothing else than to write a fast program, you shouldn't even care about readability. But if you need to do any amount of bugfixing or testing, you require a balance between the speed of the executable and the modularity of the code.
@@benshulz4179 Thanks for the reply. An update on that. We got a new lead programmer on our team and he explained it to the bosses better than we could. We get to code better than before at a good pace during projects now. We finally had the time to test them out and let the apps leave without errors, unlike with our previous lead and manager. We wanted to be given time to code properly and give quality products but could not do so under the previous guys because they'd push us to finish as fast as possible, and when complaints came we'd get the blame while the previous lead programmer dunked on us further. Sorry for ranting at the end. I'm glad we got a new lead programmer that actually leads our team.
so it's the append which fucks up stuff. I need to check out if using variable 'i' and not using any variable '_' makes a difference when working with large numbers. let me know your findings too brother!
@@koktszfung Sure, you might know the size beforehand, but Python's list comprehensions don't use that information. (for that, use `list(range(n))` instead)
@@megaing1322 It’s not possible to define strict size, but you can avoid list resizing (which by the way all in all takes O(n)) by simply writing arr = [0] * n. By default python list has size of 10.
@@ГлебГолубев-ч7щ Yes, list resizing all in all takes O(n); each individual append, however, is O(1) (see amortized cost). So by prefilling the list you gain nothing in time complexity, and I am not sure if you gain a measurable difference in actual performance.
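The amortized-cost point can be checked roughly by watching how often the list's allocation actually changes while appending. This is only a sketch: it assumes CPython, where `sys.getsizeof` reflects the slots a list has reserved, and `count_resizes` is a made-up helper name. As far as I know, a CPython empty list starts with no element slots reserved and grows in steps from there.

```python
import sys

def count_resizes(n):
    """Append n items and count how often the list's allocation changes."""
    items = []
    last_size = sys.getsizeof(items)
    resizes = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:  # the underlying buffer was reallocated
            resizes += 1
            last_size = size
    return resizes

if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(n, count_resizes(n))
```

The number of reallocations should grow far more slowly than the number of appends, which is what keeps the average cost per append constant.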
I personally disliked it because the title didn't say the video was looking specifically at the Python implementation. I clicked it for an algorithmic explanation
It would be interesting to compare performance between list comprehensions and generators: [x**2 for x in range(n)] compared to (x**2 for x in range(n)). Of course there is a difference in the moment the values are computed. A generator could be the best choice to avoid holding the whole list in memory while still having the values available on demand. Switching a list to a generator is one way to make code more efficient.
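A small sketch of the memory side of that comparison. The variable names are just for illustration, and `sys.getsizeof` only measures the container itself, not the integers it refers to:

```python
import sys

n = 100_000
squares_list = [x**2 for x in range(n)]   # all values computed and stored now
squares_gen = (x**2 for x in range(n))    # nothing computed yet

list_bytes = sys.getsizeof(squares_list)  # grows with n
gen_bytes = sys.getsizeof(squares_gen)    # small, does not depend on n

total_from_gen = sum(squares_gen)         # values are produced one at a time here
total_from_list = sum(squares_list)
```

Both totals are the same; the difference is when the work happens and how much is held in memory at once. Note the generator can only be consumed once.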
If you have complex operations in a list comprehension it is almost always better to extract them into a function and then do something like [ f(n) for n in range(100) ]. This makes it clear that the list isn't being constructed by a recursive process and the reader can safely understand the function f independently of the range of values it is taking as input.
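A quick illustration of that advice. `classify` is a hypothetical helper invented for this example; any non-trivial per-element logic would do:

```python
def classify(n):
    """Hypothetical helper: label a number by divisibility (FizzBuzz-style)."""
    if n % 15 == 0:
        return "fizzbuzz"
    if n % 3 == 0:
        return "fizz"
    if n % 5 == 0:
        return "buzz"
    return str(n)

# Everything inline: correct, but the reader has to untangle it.
inline = ["fizzbuzz" if n % 15 == 0 else "fizz" if n % 3 == 0 else "buzz" if n % 5 == 0 else str(n) for n in range(100)]

# Logic extracted: the comprehension only says "apply classify to each n".
labels = [classify(n) for n in range(100)]
```

Both lists are identical, but `classify` can be read and tested on its own, independent of the range it is applied to.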
Oh damn, now I’m beating myself over not using this way earlier. I’ve written some horrendously complicated list-comprehensions before…
@@Suntoria236 run “import this” in Python. “Simple is better than complex”. Even if the complicated way is a little bit more efficient, a solution that is easy to understand is way more valuable
Or use a lambda function in the list comp
@@suhailmall98 this problem is specifically about not doing that to improve readability of list comprehension
But what am I going to do with my seven line listcomp?!
This was one of the most unique python videos I've watched on yt. It's the first time I've looked under the hood of a python code. Thanks!
My favorite optimization in python is rewriting my program in C++
And if your original program uses numpy, rewriting it in C++ will probably slow it down.
@@cholling1 lmao this sent me laughing
@@cholling1 I know numpy is quite fast but is it faster than code in C++?
Ofc you're gonna write a backend with C++
@@PongsiriHuang Depends how good you are at C++
You should do more vids on writing efficient code! I think youtube lacks this type of programming content
Efficient python is just python with proper use of external libraries. The most important part is still readability.
Simply put, can you understand and work on this code 2 years from now and taken out of context
@@harrytsang1501 very true, use libraries written in C like numpy, and properly follow PEP and you'll be fine
Trying to "microoptimize" python is pointless. If performance of lists of numbers was a concern, you would be using something like numpy anyway.
The performance difference between a list comprehension and a for loop is never useful in python.
What makes this video great and pleasant to watch is that your presentation is not dogmatic but analytic. There are some Python programmers out there arguing you should avoid using for loops because they are "not pythonic" and "not good for readability". But you made a pretty clear case why in Python list comprehension is better than a for loop, and provided a balanced view in terms of readability.
This was amazing, thank you for the in depth answer!
Your videos are very well crafted. Keep up the good work!
I'm a little confused - if the main time savings arise from not having to load, precall, and call the append() function, why are the gradients of the list comprehension and for-loops so similar? It looks as though list comprehension in your example has a pretty constant time advantage. Would loading, precalling, then calling in every iteration not imply that the longer the list, the greater the time saving?
No, it doesn't "load" list like you think it does. The list is always in the memory, python just loads "header" part of the list, which is of constant size.
Both are O(n) in total, just the scaling factor C is different. Each iteration takes less time, but this time is constant whether n is 10 or 10000.
I agree with @williamjedig7480. Since the lines are almost parallel, the overcost of the for loop method can't be caused by something in the for loop. It looks like there is no link between the graph and the disassembled code.
@@megaing1322 if it is linearly scaling differently with n, then the slope would be different. Here, it is some random shift vertically.
@@koktszfung You are right, his data fails to show it, I guess it's trash, i.e. not enough data points or too much noise. But correctly done, You clearly see the different slopes, and my explanation of the result you would see if you did it yourself is correct.
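One way to settle the slope-versus-offset question yourself is to time both versions at several sizes and fit a straight line to each. The sketch below is just that, a sketch: `fit_line` and `best_time` are illustrative helpers, the two functions are my reconstruction of the ones discussed in the video, and the numbers will vary by machine and Python version. A per-iteration overhead should show up as a different slope (time per element); a one-off cost would show up only in the intercept.

```python
import time

def for_loop(n):
    result = []
    for x in range(n):
        result.append(x)
    return result

def list_comp(n):
    return [x for x in range(n)]

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def best_time(func, n, repeats=5):
    """Smallest wall-clock time over a few runs, to reduce noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(n)
        best = min(best, time.perf_counter() - start)
    return best

if __name__ == "__main__":
    sizes = [200_000 * k for k in range(1, 6)]
    for func in (for_loop, list_comp):
        slope, intercept = fit_line(sizes, [best_time(func, n) for n in sizes])
        print(f"{func.__name__}: {slope * 1e9:.1f} ns/element, offset {intercept * 1e3:.3f} ms")
```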
Top explanation! Many thanks for such high quality and easy to understand content! 🙏
You should also keep in mind that range is a generator and depending on its use the comprehension will return a generator too instead of the list.
Depending on the scale (when you don’t need to keep all data in memory, when you’re working with infinite generators etc), the comprehension may make things even more useful than just faster.
List comprehensions (in square brackets) ALWAYS return lists, even if the code inside them iterates over a range. Generator comprehensions (in parentheses) return generators.
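This is easy to verify in a few lines. The variable names are arbitrary; the checks use the standard `collections.abc` and `types` modules. It also shows that `range` itself is a lazy sequence rather than a generator:

```python
from collections.abc import Iterator, Sequence
from types import GeneratorType

r = range(5)
as_list = [x * 2 for x in r]   # square brackets: a list, built immediately
as_gen = (x * 2 for x in r)    # parentheses: a generator, evaluated lazily

range_is_sequence = isinstance(r, Sequence)   # range supports len() and indexing
range_is_iterator = isinstance(r, Iterator)   # ...and is not itself an iterator
```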
generator comps are so sick. i use them all the time with pytest
"Now, the interpreter has 3 parts. 1) the compiler..." (1:50)
Me: Hold up!! I call foul.
This is such an interesting video! I had no idea about Python's `dis` library but I genuinely think I'm going to use it a lot now!
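For anyone who wants to try `dis` on the two versions, here is a minimal sketch. The two functions are my reconstruction of the ones in the video, and `opnames` is a helper I made up that also walks nested code objects, since older CPython versions compile a comprehension into a separate code object. Exact opcode names differ between Python versions, so treat the printed output as version-specific.

```python
import dis

def for_loop(n):
    result = []
    for x in range(n):
        result.append(x)
    return result

def list_comp(n):
    return [x for x in range(n)]

def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if hasattr(const, "co_code"):  # nested code object (older comprehensions)
            names |= opnames(const)
    return names

loop_ops = opnames(for_loop.__code__)
comp_ops = opnames(list_comp.__code__)

if __name__ == "__main__":
    dis.dis(for_loop)    # shows the attribute lookup and call for .append
    dis.dis(list_comp)   # shows the dedicated LIST_APPEND instruction
    print(sorted(comp_ops - loop_ops))
```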
Since often times list comprehensions in python are used to filter or map elements from a list, how does the speed compare when using a list comprehension compared to using the map/filter methods?
Interesting question. I think the biggest difference is that map, filter and zip are generators, so they run only when the next element needed, but no more. So if you don't need to accumulate intermediate results and you have a long chain of filtermap-like operations, it may be better to use generators at first and then iterate them. However I like to use functools.partial for lazy execution rather than generators, because they work in much more intuitive and explicit way (for example, you can't iterate through the same generator twice which can cause a bug if you're not careful, but you can use a list of partials as many times as you want).
You can just use a generator expression. Haven't checked with the new optimizations. But in the past comprehensions were basically always faster because of C and optimizations.
Filter is a generator. It makes code more efficient; look for a video on the itertools library.
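To make the comparison concrete, here is the same filter-then-map written both ways. Strictly speaking `map` and `filter` return lazy iterators rather than generators, but they behave similarly for this purpose. No timings are claimed here; those depend heavily on the function being applied and the Python version.

```python
numbers = range(10)

# Comprehension: filter and map in one expression, result is a list.
comp = [x * x for x in numbers if x % 2 == 0]

# map/filter: each call returns a lazy iterator; nothing runs until consumed.
lazy = map(lambda x: x * x, filter(lambda x: x % 2 == 0, numbers))
is_own_iterator = iter(lazy) is lazy   # True for iterators
from_lazy = list(lazy)                 # consuming it does the work
leftover = list(lazy)                  # a second pass yields nothing
```

The empty `leftover` is the "can't iterate twice" pitfall mentioned above.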
Your videos are simply perfect, easy to understand, quick and simple.
Good job, keep it up👍👍
List comprehension ftw 😎 Really nice explanation!
Damnn someone making non beginner content, great work mann
Thanks for explaining the why behind each snippet
I like your font ! What is it ?
1:36 Very easy to read and understand, not on my watch!
Agree, numpy or pandas are faster, but add extra learning curves. This video is a training exercise. For computation on a big list of numbers that needs to be stored in memory, I would use numpy. Otherwise I would prefer a generator.
I don't use Python, but Julia has this too, and it's amazing to use (especially with Julia's actual support for multidimensional arrays).
Thanks, I'm just a Python beginner but I find these videos about what's happening in the background very interesting. Many times I wonder why it is doing this or that and the answer is some call stack or explanation like this.
Awesome content. Quick question, what tool do you use to animate your videos?
How would the time change if instead of using append, which has to sometimes reallocate another buffer, you just did something like my_array = n*[None] and then my_array[i] = x
I tested it and posted the results in a different comment. Short version is that your method is an additional ~17% faster than list comprehension.
Great explanation, thank you
Well, thank you for making another video that we can learn from.
If I can ask... What's your VSCode theme? It looks sick
The colors look like typical Darkula or Dracula - not sure what's available in VSCode, but I assume one of those would be there.
what font do you use?
Great explanation!
Your content is so good, learn new thing everyday
Amazing video, with a great explanation!!
wow, super useful, I always asked myself if there are performance differences...
I remember doing this experiment and was amazed how much difference there was. Also, the operator ** on list is really fast
I'd love to see if there were any difference if the list in the for loop example was preallocated. Usually, when increasing the size of a list, eventually the list's memory has to be reallocated to a larger block, which of course takes some time. If the list was preallocated this step would be faster
Yeah, I'd be interested to know how a more typical optimized approach compares. In C, for instance, you'd allocate the array once with its known size, then loop to assign the values, which is simple memory assignment to a known address with no overhead of function calls. I wonder if looping through the list in python with element assignments would have similar or even better performance than the comprehension version, which still is going to run into reallocating memory.
So I went ahead and tested this theory. The exact sample functions in this video, plus a "preallocated" version which is initiated using a for loop. I evaluated for various values of N, stepped by 50K all the way up to lists of size 10M, 10 averaged samples per function.
Results:
~0.1067 seconds per 1000000 elements for the for-append loop
~0.0800 seconds... for the list comprehension (~25% faster)
~0.0661 seconds... for the preallocated for-assign loop (additional ~17% faster)
All of these scale very linearly with the size of list. So the C-style version gives pretty significant additional time savings, at least for this very simplistic task.
P.S. The graph shown in the video is pretty nonsensical. If anything, it looks like it's showing a shallower coefficient for the for-append loop.
Also, I timed things with time.process_time(). The video shows him using perf_counter(), which isn't great for showing code efficiency because it's the total elapsed runtime including any time the process spent sleeping.
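For anyone who wants to repeat the comparison, a rough sketch follows. The commenter's actual test code is not shown, so the three functions are reconstructions; `prealloc_assign` uses `[None] * n` as suggested in an earlier comment, and `cpu_time` is an illustrative helper. Results will differ by machine and Python version.

```python
import time

def for_append(n):
    result = []
    for x in range(n):
        result.append(x)
    return result

def list_comp(n):
    return [x for x in range(n)]

def prealloc_assign(n):
    result = [None] * n          # allocate every slot up front
    for i in range(n):
        result[i] = i            # plain index assignment, no method call
    return result

def cpu_time(func, n, repeats=10):
    """Average CPU time per call, measured with time.process_time()."""
    start = time.process_time()
    for _ in range(repeats):
        func(n)
    return (time.process_time() - start) / repeats

if __name__ == "__main__":
    n = 1_000_000
    for func in (for_append, list_comp, prealloc_assign):
        print(f"{func.__name__:16s} {cpu_time(func, n):.4f} s per {n} elements")
```

`time.process_time()` counts CPU time for the process and excludes time spent sleeping, whereas `time.perf_counter()` measures elapsed wall-clock time.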
@@MG-xn4ug thanks for taking the time to test it! Interesting how the for loop is now faster!
dude I knew this video was wrong. That's why I looked for this comment!
What font do you use? Looks neat
Your Python game is rising, this is good content 👍🏻
cool and neat idea to toss around!
Do you recommend using a for-loop instead of a list-comprehension for more complex tasks just for the sake of readability or is there some trade-off at some point?
Just FYI:
You can even use list/dictionary comprehension as parameter in object or function!
yes you can
I don't really like python but list comprehension is both performant and syntactically interesting ! Nowadays, zig is more low level than c
Nice video, thanks for that! Where can I look for "best practices" or how to write more efficient code? I'm learning about these topics in Python now, but it's very hard to find libraries and content...
Great video!
My perspective is that if you have to sacrifice readability for performance in python you are generally using the wrong language for the problem.
Need a language that is easy to learn, get started with and make small to medium sized prototypes in? Then Python is a good choice for a lot of people.
Need a language that is fast and flexible? Then go with something like C, C++ or Rust. If it's just some small part you can make bindings. Even fairly simple C++ code can sometimes be thousands of times faster than the equivalent Python code.
There's always a middle ground of ease of use vs performance. It doesn't have to be black and white
I have one friend who insists on using list comp for everything even if it makes the code nearly unreadable in the long run. He's not dealing with large enough lists for list comprehension to really matter speed wise and just does it cause he can
very good video, love it !
I learnt list comprehension like 2 days ago, though im new and it gets confusing sometimes
But! Its so cool and i love it
Is this more efficient in dict and set too ?
I totally love the video but the music is driving me crazy 😂. Of course, to each his own, the rest might like it, I just would like to have the option to listen to my own music or to go for silence whenever I feel! 😇 Anyway, thanks for the video, brilliant stuff. 👍
Consideration: what if you set t = result.append and call t? This should be much faster
0:35 just do return list(range(n))
please do more of these under the hood videos, very interesting
i write list comprehensions in multiple lines with indents so they're pretty readable even when decently complex
this is the way
Both Black and Ruff format it this way for you, so it's very readable
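As an illustration of the multi-line layout described above, here is a comprehension written with one clause per line. The data and names are invented for the example, and the layout is hand-written in the general style the thread describes, not copied from formatter output.

```python
# Invented sample data: each order holds (item name, quantity, unit price) tuples.
orders = [
    {"customer": "ann", "items": [("pen", 2, 1.50), ("ink", 1, 7.25)]},
    {"customer": "bob", "items": [("pad", 3, 4.00)]},
]

# One clause per line: the result expression, each `for`, then the filter.
line_totals = [
    (order["customer"], name, quantity * unit_price)
    for order in orders
    for name, quantity, unit_price in order["items"]
    if quantity * unit_price > 5
]
```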
Why doesn't the python compiler detect you're only using the for loop to append and change the bytecode to a list_append instruction?
Does anyone know his vs code setup? Theme, font, etc.
Great video, and awesome animations!
To further prove the point in the video one could do the following:
def for_loop_preloaded(n):
    my_list = []
    # pre-load a reference to the append method as to avoid the "LOAD_METHOD" within the for loop
    append_method = my_list.append
    for x in range(n):
        append_method(x)
    return my_list
Testing this, we can see that it's noticeably faster than the "for_loop" function but still slightly slower than list comprehensions!
That's because it still needs to perform a lookup for the "append_method" variable each time, unlike a list comprehension that creates and uses an anonymous list (which is typically stored in a variable and/or used as a function argument after the list is complete).
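To try this comparison yourself, a small timing harness along these lines should work. It is a sketch: the function names `for_loop` and `list_comp` are assumed here, and the timings will vary by machine and Python version, so no expected numbers are given.

```python
import timeit

def for_loop(n):
    result = []
    for x in range(n):
        result.append(x)      # attribute lookup + call on every iteration
    return result

def for_loop_preloaded(n):
    result = []
    append = result.append    # bound method looked up once, before the loop
    for x in range(n):
        append(x)
    return result

def list_comp(n):
    return [x for x in range(n)]

if __name__ == "__main__":
    n = 100_000
    for fn in (for_loop, for_loop_preloaded, list_comp):
        # best of 3 runs, each run calling the function 10 times
        seconds = min(timeit.repeat(lambda: fn(n), number=10, repeat=3))
        print(f"{fn.__name__:20s} {seconds:.4f}s")
```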
Can you please do a video on cgi using python and html forms
It's very simple: the difference is because of append. If you would compare the for loop using something like "result = [0] * size_needed" and then enter the for loop and just index each result instead of appending, the performance would be very similar.
Shouldn’t be. Appending in Python is amortized constant time, if I am not mistaken.
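The preallocation idea from this exchange can be sketched as follows, next to the plain append loop for comparison. The function names are made up for the example, and whether the preallocated version is measurably faster is left for the reader to time.

```python
import timeit

def append_loop(n):
    result = []
    for i in range(n):
        result.append(i)      # grows the list one element at a time
    return result

def preallocated(n):
    result = [0] * n          # allocate all n slots up front
    for i in range(n):
        result[i] = i         # index assignment, no append call
    return result

if __name__ == "__main__":
    n = 100_000
    for fn in (append_loop, preallocated):
        seconds = min(timeit.repeat(lambda: fn(n), number=10, repeat=3))
        print(f"{fn.__name__:15s} {seconds:.4f}s")
```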
Great! Another thing not taught at college! Glad I took Bio and Chem!
What library did you use to create the plot?
What font is that
How does the type specialization in python 3.12 change this?
How did you get experience on python internals?
I tend to write list comprehension within list comprehensions.
Also, fun fact, there are only dictionary, set and list comprehensions in Python and they all can have comprehensions nested within each other.
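A short illustration of nesting the different comprehension kinds inside each other, using invented sample data:

```python
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# dict comprehension whose keys come from a set comprehension
# and whose values are set comprehensions
letters_by_initial = {
    initial: {ch for word in words if word[0] == initial for ch in word}
    for initial in {word[0] for word in words}
}

# list comprehension nested inside a list comprehension
grid = [[row * 3 + col for col in range(3)] for row in range(2)]
```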
The one mystery that I want explained is the following example:
num1 = 1
num2 = num1
What on Earth is the point of doing that? As far as I can tell, the only thing it does is make num1 and num2 point to the same object making num2 redundant. I get annoyed whenever I come across this in someone else's code.
How did you make that graph? Did you make it in Python or with something else?
Great!! 👍👍👍
I wonder how this performs compared to a list(map(lambda x: x, range(n)))
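One way to check this is to time both variants side by side. This is only a sketch with made-up function names; results depend on the interpreter version, so none are quoted here.

```python
import timeit

def with_map(n):
    # every element goes through a Python-level lambda call
    return list(map(lambda x: x, range(n)))

def with_comprehension(n):
    return [x for x in range(n)]

if __name__ == "__main__":
    n = 100_000
    for fn in (with_map, with_comprehension):
        seconds = min(timeit.repeat(lambda: fn(n), number=10, repeat=3))
        print(f"{fn.__name__:20s} {seconds:.4f}s")
```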
Isn't it moreso that append keeps being called and not that the list comprehension is actually faster
I'm surprised that the append method isn't just inlined to prevent the need for a method call. I assume that append does something similar to just the LIST_APPEND op code, so such a small method should be optimized. Java is similarly high level, with compiling to byte code and running on a virtual machine (the JVM), but it optimizes method calls vs inlining automatically during compilation, based on the method's complexity. In a couple of other languages that run on the JVM, you can even specify explicitly if you want to inline a method or not (with some added benefits surrounding generics)
`list.append` is implemented in C. I don't see how this can be inlined. Also, it's difficult to prove that `my_list` is actually of type `list`, even though we declare it as such above. That's partly because Python doesn't have a static type system contrary to Java and `my_list` could get modified through, for example, threads.
Java is not similarly high level, it gives you way more control. Most notably, Java is statically typed, Python isn't. The compiler can't know that the variable is actually a list, so the call can't be inlined. If you want stuff like that, look at Cython.
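A small sketch of why the compiler cannot assume which `append` it is looking at. `DoublingList` and `fill` are hypothetical names for this example; the same loop body behaves differently depending on the object passed in at runtime.

```python
class DoublingList(list):
    """Hypothetical list subclass whose append stores twice the value."""

    def append(self, item):
        super().append(item * 2)

def fill(container, n):
    # Which append runs here is only known once the function is called.
    for x in range(n):
        container.append(x)
    return container

plain = fill([], 3)
doubled = fill(DoublingList(), 3)
```

Since both calls run the same bytecode for `fill`, replacing `container.append(x)` with a fixed list-append instruction at compile time would change the result of the second call.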
@@megaing1322 "Java is not similarly high level, it gives you way more control."
it is just as high-level, if not more. Having more control doesn't make it any less high-level. Being incredibly slow and without checks is NOT a trait of high-level languages.
@@ABaumstumpf higher level = further from the actual CPU. In Java, you have direct access to various low-level types like different int sizes, a choice between float and double, and arrays are exposed way more directly than in Python. Yes, Python is higher level than Java. That doesn't say anything about the quality of either. But both are of course still high-level languages; anything at the level of C or above is.
@@megaing1322 "In Java, you have direct access to various low-level types like different int sizes, a choice between float and double, and arrays are exposed way more directly than in Python. Yes, Python is higher level than Java."
No, that shows that python is a dynamically typed language. and that was a thing even 50 years ago in some low-level languages.
The concept of high/low-level only makes sense when talking about which operations and idioms a language supports.
Python does not support low-level programming, Java also not really, C++ does.
python does support high-level abstractions, so does Java, so does C++.
And no, not "anything above C" cause C is also high-level depending on the environment and what you are doing. Not as highlevel as many modern languages of course, but you are no longer restricted to bare-to-the-metal code.
And btw: java has arbitrary precision numbers, and Python has floats (which usually are just C doubles) and complex (2 "floats"), and until Python3 it also had 2 integral-types.
having more options available does not "make" a language lower-level.
A very comprehensive Video🫶🏼
I've heard comprehensions will become comparatively even better in 3.12
What is being optimized is more the case where comprehensions are called a lot. An individual comprehension's iterations (which is more or less what is being tested here) won't change that much.
python is so cool real not fake.
List comprehensions are not Python's idea
@@alang.2054 python still cool not fake
Real
real
@@alang.2054So where did these ideas come from? What are other languages that employ list comprehension??
It's always funny how whenever someone explains why X is faster than Y in Python, it always comes down to the actual C implementation of X and Y.
It's like python is just a C wrapper
(I use rust, btw)
It quite literally is
I mean, rust is just an assembly wrapper.
@@megaing1322 yup, this is technically correct. Even though I want to see anyone able to write Rust-like code in assembly lol (well I guess Python would be the same, as I guess the underlying C code is hard af)
Every programming language is a wrapper for machine code, if you really think about it
@@vinylSummer no. Machine code differs greatly based on compiler, platform, etc.
you should test each function on a separate file for transparency.
Would this be O(n)?
both are O(n), you can see it's linear in the graph
Good information
Theme?
bro, please tell me which theme you are using...............................
This is interesting, although I always feel that if you're trying to do these sorts of optimizations to Python code, you've picked the wrong language
Sometimes it isn't possible to change language for various reasons and Python may be forced on you and your team, but it doesn't mean that performance never matters at all.
I feel the same way. I primarily use C++. I've done it for 30+ years. I don't understand why the industry has coughed up a language that does for the most part the exact same thing as C++ (Loops, arrays, variables, classes) just a lot lot slower... Why drive a Pinto when you can drive a Ferrari?
Supposedly C++ is difficult... I don't get why people feel that way. I picked C up when I was 15ish by reading a book. I started picking up C++ a couple years later. This was long before YouTube, Google, StackOverflow, Udemy, LeetCode.
Just about every language boils down a few basic concepts.
1) Sequential execution of instructions.
2) Loops or conditional branching (JMP, JLE, GOTO, IF, WHILE, FOR, DO , etc.. it's all variants of the same concept)
3) Storage in Memory, for example variables, (a , b, count, i, j, numDaysInMonth, numGoalsScored, x, y, z, etc..)
4) Storage in Multiple chunks (bytes) of memory (malloc, alloc, new , free, delete, smartpointers, std::vector, lists, dictionaries, tuples, etc...)
Now OO languages have things like classes, inheritance, polymorphism, but that stuff isn't overly complicated. Python has classes. Most of the time when I write Python, I use classes, I'm just used to it.
It's all the same crap. I've programmed in LOGO, GW-BASIC, ASM, C, C++, C#, Visual Basic, Perl, Python and probably a few I'm forgetting. I stayed away from Java thankfully. All of these languages boiled down to the same core concepts listed above, no matter what problem you were trying to solve.
So... if you're going to spend a buttload of time writing a bunch of code, why not do it in a language that runs 30x faster??
Honestly, if you have a language that has 3-4 different ways to do a for loop and they all have different performance characteristics that warrant making YouTube videos about it, well, I think that's just a fundamental problem with the language itself and is something that really never should have ended up that way.
I think that it might have to do with array resizing. I do not know so much about Python, but in C/C++ it takes a lot longer to create a new array and refill it than to create an oversized one and just add to the corresponding index.
No, that doesn't happen here. It would in theory be possible, but as you can see from the bytecode sequence, none of those preallocate elements. `list.append` of course is smart enough to correctly scale preallocation to make append O(1), but that is not a difference between the two versions. However, this would happen for example if you call `list(range(n))` instead.
@@megaing1322 Yes, the cost here is the function call. Most compilers will inline the function call in contexts like this. I know in C++ most compilers would inline this function when used in a for loop, especially if the function is a template. It's a little strange the Python compiler chooses not to inline this. But, then again, usually these JIT-type scenarios do very few optimizations. And I imagine the dynamic typing of Python might force this to not be inlined.
@@lucass8119 You clearly have no idea how python works. It isn't possible for the compiler itself to inline list.append. And CPython (which is what is being talked about) currently has no JIT compiler.
@@lucass8119 " the cost here is the function call."
At least the data from this video would heavily imply - No, that is not correct and not the source of the differing performance:
Here the 2 lines are almost parallel with a constant offset (the for-loop even slowly catching up). That can not be the result of an overhead that is incurred repeatedly
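The arithmetic behind this disagreement can be sketched with a toy cost model. Everything here is an assumption for illustration: the per-item costs are invented, and the thread does not say whether the plot in the video used linear or logarithmic axes. Under a purely linear model, a fixed extra cost per iteration makes the absolute gap grow with n, while the gap between the logarithms of the two times stays constant.

```python
import math

COMPREHENSION_COST = 20e-9   # invented per-item cost in seconds
APPEND_OVERHEAD = 10e-9      # invented extra per-item cost of calling append

def model_time(n, per_item):
    """Assumed cost model: total time is per_item seconds for each element."""
    return per_item * n

def gaps(n):
    """Return (absolute gap, gap between log10 of the times) for n items."""
    fast = model_time(n, COMPREHENSION_COST)
    slow = model_time(n, COMPREHENSION_COST + APPEND_OVERHEAD)
    return slow - fast, math.log10(slow) - math.log10(fast)

if __name__ == "__main__":
    for n in (1_000, 1_000_000):
        absolute, logarithmic = gaps(n)
        print(f"n={n:>9}: gap={absolute:.6f}s  log10 gap={logarithmic:.4f}")
```

So under this model, lines that look parallel on a log-log plot are consistent with a per-iteration overhead, whereas lines that are parallel on linear axes are not.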
@@ABaumstumpf The overhead isn't incurred repeatedly, it's a constant time factor. Therefore, the two lines being parallel makes perfect sense.
It's not like the second function call is more expensive than the first and so on. Each is equally expensive (theoretically) so the lines should be perfectly parallel, with the one with a function call being slightly slower. Both are O(n), they should be parallel.
We know it has to be the function call, because look at the disassembly. That's the only difference.
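The disassembly being referred to can be reproduced with the standard `dis` module. A sketch, assuming the two functions look roughly like the ones in the video: the helper `opnames` (a made-up name) also walks nested code objects, because on older Python versions the comprehension body lives in its own code object. Exact opcode names for the method call differ between Python versions, so only `LIST_APPEND` is checked.

```python
import dis
import types

def for_loop(n):
    result = []
    for x in range(n):
        result.append(x)
    return result

def list_comp(n):
    return [x for x in range(n)]

def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names

if __name__ == "__main__":
    dis.dis(for_loop)    # shows a method lookup and call for append in the loop
    dis.dis(list_comp)   # shows LIST_APPEND instead
```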
Is it a toxic workplace if your boss tells everyone to make the codes faster even if we sacrifice readability?
Our layoffs have calmed down and we do print out documentation of our system, but I worry for the future employees and want to ask if I should step up and say that we should not sacrifice readability for speed.
My view here, on readability: list comprehensions don't make code unreadable if people are used to coding in Python. They're even sometimes simpler. In fact, I will say it depends on the complexity of the computation. I would say do not use list comprehensions if there are side effects.
@@olivierbouchez9150 I'm talking about it in a more general sense.
I really appreciate your perspective on it. Thanks and have an awesome day/afternoon/night. :)
@@aizenvermillion434 Code speed isn't opposite of readability. Readability is an issue of either telling the truth about the use of your variables/functions, or not.
Opposite of code speed is development speed. More you optimize, more difficult it will be to do small changes to the program, and vice versa. Development is about making the best product (code speed) or improving on that product (development speed )
If your job requires you to do nothing else than to write a fast program, you shouldn't even care about readability. But if you need to do any amount of bugfixing or testing, you require a balance between the speed of the executable and the modularity of the code.
@@benshulz4179 Thanks for the reply.
An update on that. We got a new lead programmer on our team and he explained to the bosses better than we could.
We got to code better than before at a good pace during projects now. We finally had the time to test them out and let the apps leave without errors unlike with our previous lead and manager.
We wanted to be given time to code properly and give quality products but could not do so under the previous guys because they'd push us to finish as fast as possible and when complaints came we'd get the blame while the previous lead programmer dunks on us further.
Sorry ranting in the end. I'm glad we got a new lead programmer that actually leads our team.
Youre like fireship but python
so its the append which fucks up stuff. i need to check out if using variable 'i' and not using any variable '_' makes a difference when working with large numbers. let me know your finding too brother!
The real question is why the compiler doesn’t make the same bytecode for both options
For this specific case where there's no added condition, can't you simply return range(n)?
Nope - not been able to do that for a long time. range() returns a lazy range object, not a list. You could return list(range(n)).
@@tokeivoi was fast to assume it returned a list after printing it, thanks for the info
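A quick sketch of the distinction in Python 3: `range(n)` produces a lazy `range` object, and `list(range(n))` materialises it. The function names are invented for the example.

```python
def numbers_lazy(n):
    return range(n)          # a range object; values are produced on demand

def numbers_list(n):
    return list(range(n))    # a real list with every element stored

lazy = numbers_lazy(5)
materialised = numbers_list(5)
```

The `range` object still supports `len()`, indexing and repeated iteration, so in many cases the caller never needs the list at all.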
How to plot like this?
Can i do list(range(n)) ?
If you need performance you have to use numpy
"python" and "efficient" in the same sentence is crazy.
Tell me ur shit at python without telling me ur shit at python
List comprehensions are basically just mathematical notation. It's beautiful.
Actually, in Python 3.12 they made comprehensions up to 2 times faster than in 3.11
Great to hear
Can someone tell me what IDE this is, I use Jupyter notebook.
vs code
The best channel for people like us who work extensively in python
what about mapreduce
For better performance you should define your array size (in current task it’s n). Pre-defined array doesn’t waste time on expanding itself.
I agree. When you use list comprehension, you know the size of the array beforehand. Right now it seems unfair
This isn't possible with python lists without writing manual C code. And list comprehensions don't do such an optimization.
@@koktszfung Sure, you might know the size beforehand, but Python's list comprehension don't use that information. (for that, use `list(range(n))` instead)
@@megaing1322 It’s not possible to define strict size, but you can avoid list resizing (which by the way all in all takes O(n)) by simply writing arr = [0] * n. By default python list has size of 10.
@@ГлебГолубев-ч7щ Yes, list resizing all in all takes O(n), each individual append however is O(1) (see amortized cost). So by prefilling the list you gain nothing in time complexity, and I am not sure if you gain a measurable difference in actual performance.
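The amortized-cost point can be observed on CPython by watching `sys.getsizeof` while appending: the reported size only changes when the list over-allocates a new chunk, which happens far less often than once per append. This is a sketch that relies on CPython's behaviour; `count_resizes` is a made-up name and other interpreters may report sizes differently.

```python
import sys

def count_resizes(n):
    """Count how often the list's reported size changes while appending n items."""
    items = []
    last_size = sys.getsizeof(items)
    resizes = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:      # the list grew its underlying allocation
            resizes += 1
            last_size = size
    return resizes

if __name__ == "__main__":
    n = 10_000
    print(f"{n} appends caused {count_resizes(n)} reallocations")
```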
I would say "somewhat more efficient".
I don't know why up to 4 people disliked this video ;-;
It's really a good video!!!!
I personally disliked it because the title didn't say video was looking specifically at python implementation. I clicked it for algorithmic explanation
It would be interesting to compare performance between list comprehensions and generators.
[x**2 for x in range(n)] compared to (x**2 for x in range(n)): of course there is a difference in when the values are computed. A generator could be the best choice to avoid loading the whole list into memory while still having values available on demand. Switching from a list to a generator is one way to make code more efficient.
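A small sketch of that comparison. It looks only at the size of the container objects themselves (via `sys.getsizeof`), not at the integers they refer to, and it does not time anything.

```python
import sys

n = 100_000
squares_list = [x**2 for x in range(n)]   # all values computed and stored now
squares_gen = (x**2 for x in range(n))    # values computed one at a time, on demand

list_bytes = sys.getsizeof(squares_list)  # grows with n
gen_bytes = sys.getsizeof(squares_gen)    # small and independent of n

total_from_gen = sum(squares_gen)         # consumes the generator
```

Note that the generator can only be iterated once; after `sum()` it is exhausted, whereas the list can be reused.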
my_list = [*range(100)]
is the actual way for [x for x in range(100)]
The lines seem to be converging, so that means that at some point, a for loop might be more efficient. :^)
Nice video, but on my side, the list comprehension is taking more time than the for loop.