python is removing the GIL! (PEP 703) (advanced) anthony explains

  • Published: 15 Aug 2023
  • will this be another python 3 disaster? what all needs to change?
    - why remove the GIL? • why remove the python ...
    - immortal objects in python -- • refcount ONE BILLION? ...
    playlist: • anthony explains
    ==========
    twitch: / anthonywritescode
    discord: / discord
    twitter: / codewithanthony
    github: github.com/asottile
    stream github: github.com/anthonywritescode
    I won't ask for subscriptions / likes / comments in videos but it really helps the channel. If you have any suggestions or things you'd like to see please comment below!
  • Science

Comments • 172

  • @mrswats
    @mrswats 11 months ago +100

    If python is removing the GIL, that means it's not a snek, it's a physh.

    • @mrswats
      @mrswats 11 months ago +8

      All joking aside, this is good stuff! Would love to know more about how to write safe threaded python even if we do not end up without the GIL.

    • @legion_prex3650
      @legion_prex3650 10 months ago +1

      @@mrswats If you want to learn about threading, dive into the low-level threading module of python (and use higher-level APIs like the ThreadPoolExecutor later on), and have a look at semaphore objects, locks and mutexes, for example.
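      For instance, a minimal sketch of the lock-plus-executor pattern with stdlib threading:

      import threading
      from concurrent.futures import ThreadPoolExecutor

      counter = 0
      lock = threading.Lock()              # protects the shared counter

      def work(n):
          global counter
          for _ in range(n):
              with lock:                   # only one thread mutates at a time
                  counter += 1

      with ThreadPoolExecutor(max_workers=4) as pool:
          for _ in range(4):
              pool.submit(work, 100_000)

      print(counter)                       # 400000; without the lock this can come up short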

    • @MrHaggyy
      @MrHaggyy 10 months ago

      @@mrswats I don't think threading is a topic most python developers need to care about. It's, like datatypes, an implementation-specific tool that uses parallel execution whenever dividing and copying or initializing the problem on multiple cores is significantly faster than sequential execution. The most obvious example is a simulation with a bunch of initial conditions: running one set is completely independent of the rest, so it doesn't matter which core does it. Another good example is graphics, where you have to load images, calculate a 2D projection on the screen, and polish the image with post-processing like lighting. For this type of problem, you set up a chain of computation in individual threads. One thread gets a pointer to the data as well as a trigger signal, does its thing, and wakes the next thread in line. That way you can start the next iteration of the chain before the previous one ends and cut down latency, as much less waiting is happening.
      Python uses C-language threading heavily in, for example, networking. The interpreter uses a module that is, for example, a C implementation of TCP/IP that uses multithreading for multiple connections. The interpreter only interacts with that module, so it sees data sent as well as complete and correct data received. That way your code stays clean and supports a loose syntax that allows for abstract high-level thinking, and all the implementation-specific details that matter for optimization and performance are solved in a best-effort approach for you.
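      A toy sketch of that kind of staged pipeline using stdlib queues (the stage names and transforms here are made up purely for illustration):

      import queue, threading

      def stage(inbox, outbox, func):
          # each stage waits for an item, processes it, and wakes the next stage
          while True:
              item = inbox.get()
              if item is None:              # sentinel: shut down and pass it along
                  if outbox is not None:
                      outbox.put(None)
                  break
              result = func(item)
              if outbox is not None:
                  outbox.put(result)

      q1, q2 = queue.Queue(), queue.Queue()
      t1 = threading.Thread(target=stage, args=(q1, q2, str.upper))   # "load" stage
      t2 = threading.Thread(target=stage, args=(q2, None, print))     # "post-process" stage
      t1.start(); t2.start()

      for frame in ["frame-1", "frame-2", "frame-3"]:
          q1.put(frame)          # the next frame can enter the pipeline immediately
      q1.put(None)               # end of stream
      t1.join(); t2.join()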

  • @ballman_fm
    @ballman_fm 11 months ago +50

    I remember Guido discussing the GIL, basically concluding that it's not easily going to come off (maybe with Lex Fridman?). Now I see why that's the case. So many changes, slowdowns and overcomplications of existing code.
    This optional build solution also seems like a bit of a complication. I'm afraid this will cause some more fragmentation with respect to dependencies.
    Edit: typo

    • @niks660097
      @niks660097 1 month ago

      There is no fragmentation; the threading API hasn't changed, so there is zero code change.

  • @NiCo-gw2dh
    @NiCo-gw2dh 11 months ago +96

    If it changes so much and makes some things incompatible, they should release it as a new major version, Python 4.0.

    • @nexovec
      @nexovec 11 months ago +11

      That way we'd have python 10 already

    • @themartdog
      @themartdog 11 months ago +18

      @@nexovec beyond python politics, I think it makes sense to have a separate implementation rather than a major version here, because you would only want the non-GIL version when trying to squeeze performance out of threads, and you would want to keep the GIL in all other circumstances, especially single-threaded apps (i.e. most apps).
      In other words, making it a major version would imply to developers that new projects should always use the GIL-less version, when that is definitely not the case.

    • @andrebruns4872
      @andrebruns4872 11 months ago

      @@themartdog but what advantages do I get from a global interpreter lock? Doesn't quite make sense to me, but maybe I don't know yet.

    • @dealloc
      @dealloc 11 months ago

      @@andrebruns4872 Some advantages of the GIL are faster single-threaded execution, integration with non-thread-safe C libraries, and being easier to work with than fine-grained locking, which would otherwise require you to manually manage locks when mutating/accessing global and shared data. So it also gives faster multi-threading for I/O-bound programs, and for CPU-bound programs that do their intensive work in C libraries (Tensorflow/NumPy, etc.).
      You can still parallelize Python programs by running them as separate processes, each having their own interpreter and in turn their own GIL.
      Note that potentially blocking or long-running operations such as I/O and number crunching in NumPy happen outside of the GIL, so the only bottleneck is for multithreaded programs that spend a lot of time holding the GIL, interpreting CPython bytecode.
      Also, there are other Python implementations that do not have the GIL, such as IronPython. This proposal (PEP 703) is specifically about making the GIL optional in the CPython implementation.
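      For example, a minimal sketch of the separate-process approach (stdlib multiprocessing; each worker is its own interpreter with its own GIL):

      from multiprocessing import Pool

      def cpu_heavy(n):
          # pure-Python work; runs in a worker process with its own interpreter and GIL
          return sum(i * i for i in range(n))

      if __name__ == "__main__":
          with Pool(processes=4) as pool:
              print(pool.map(cpu_heavy, [2_000_000] * 4))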

    • @bearsaremonkeys
      @bearsaremonkeys 11 months ago

      @@andrebruns4872 It simplifies things for the user and reduces errors resulting from threads not syncing properly. It also makes a bunch of C libraries for python thread-safe, which they aren't natively.

  • @rosmelylawliet3919
    @rosmelylawliet3919 11 months ago +9

    I agree w/ the pessimistic prognosis, but OTOH, I don't think anybody is expecting this to work right out of the box. I really believe that the benefits from these changes will be seen in around 5 years, maybe less in some cases.
    But that's not bad, because in 5 years, whenever we get the benefits, they are going to be great!!
    I don't like the 5 to 8% perf decrease, but that is just for now. I'm confident it will improve in time, so we will get to a point where we either get massive benefits from no-gil, or a catastrophic failure where we are left in the same place as with the gil, which is not that bad for users (catastrophic for those who invested a ton of work into this, of course).
    So all in all, for us users, the future is kinda bright. Not the near future, but the far one.

  • @float32
    @float32 11 months ago +37

    Going from C++ to Python, my first WTF of the language was that my threaded code ran roughly thread times slower, rather than thread times faster. I still can’t believe it’s used as much as it is, but I type it every day.

    • @brunojambeiro6776
      @brunojambeiro6776 10 months ago +7

      When you combine it with other tools such as numba and numpy, the code can actually free itself from the GIL and make use of multiple cores. Worked quite well when I tried it.
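      A rough sketch of that route (assuming numba's nogil=True option; the compiled function releases the GIL, so plain threads can run it in parallel):

      import threading
      import numpy as np
      from numba import njit

      @njit(nogil=True)                 # compiled code runs without holding the GIL
      def block_sum(arr):
          s = 0.0
          for x in arr:
              s += x
          return s

      data = np.random.rand(8_000_000)
      chunks = np.array_split(data, 4)
      results = [0.0] * 4

      def worker(i):
          results[i] = block_sum(chunks[i])

      threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
      for t in threads:
          t.start()
      for t in threads:
          t.join()
      print(sum(results))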

    • @float32
      @float32 10 months ago +1

      @@brunojambeiro6776 yes, when you stay away from python as much as possible, when using python, it is pretty great. ;)

    • @GRAYgauss
      @GRAYgauss 9 months ago +1

      Every tool has its uses; of course if you misuse a tool it's going to seem bad. Most cases where threading appears desirable in Python, like non-blocking, you should be writing async instead. If you're looking for performance, external/C libs first, then mp/executors, and finally port... Honestly though, it's really hard to have performance concerns if you're leveraging Python as outer-loop glue correctly, and thus question whether it was Python's fault or yours. Prototype and first-draft in Python, avoid premature optimization, but then port what you can't import instead of being a monkey trying to write it all in Python. I don't really know your use case, but most of my problems the usual scilib suspects don't solve are suitable for CUDA/GPGPU, and if I really have to I can just write a CUDA/C Python module or even a Rust/Vulkan pyo3 module, but it's trivial nowadays to write a GL compute shader and execute it with a Python runner. It's probably less than 100 lines of Python and shader code to set up your pipeline, compile, and display 150k unoptimized neighborhood calcs in realtime on cheap hardware. Point is... yeah, Python isn't very fast, but if you don't misuse it, it spends CPU time to save developer time, and you can optimize when necessary rather than optimizing core loops nobody cares about.

    • @Michallote
      @Michallote 2 months ago

      @@GRAYgauss my friend, you say some funny words. But I admire you. Running shaders when numpy fails me seems like something I would like to learn. Torch is impressive in its own right.

    • @GRAYgauss
      @GRAYgauss 2 months ago

      @@Michallote Nothing to admire, just have fun and keep learning. Tbh once you learn a bunch of paradigms and math, things transfer. If you're interested, you'll be fine. Try to hone yourself though, come up with curriculums, don't get stuck "learning in place" or "learning to the appeal of desire." Learn to do something, do something to prove you've learned something. Rinse, repeat.
      My big mistake was never knowing where to go, and learning a little about everything - it did get me a job and I happen to be skilled in everything they didn't know they wanted, so it worked out. I did spend close to 2 decades studying with no goal in mind though.
      Anyways, check out arcade (or straight pyglet) or rust/wgpu for writing shaders/pipelining, triton is cool, and from there you could move to pycuda... maybe you end up in rust using pyo3 and ash to pipeline modern compute in straight vulkan in 10 years...

  • @Monotoba
    @Monotoba 10 months ago +3

    I too think that this won't get a good foothold. However, I do see it as a great learning experience for the developers of python. An experiment that will give them better insight into the issues (and perhaps solutions) of leaving the GIL behind in the future. I also support releasing this as a new major release, or at least an experimental release version, during which time they should keep the current version with the GIL in place clean of all modifications needed to remove the GIL. I don't think removing the GIL and slowing code will kill python, but needing to recompile all those higher-performance and long-standing C libraries will certainly cause a dip in popularity, as many of the long-standing C extensions may never be recompiled!

  • @ManInSombrero
    @ManInSombrero 11 months ago +3

    Thanks for the explanation!
    Anthony, you say you are pessimistic about it - but what's the alternative? Do you think it's better to leave it as is, or could there be a better solution than the ones described in the PEP?

  • @n0ame1u1
    @n0ame1u1 10 months ago +1

    Seems interesting. I'm worried about slowdowns, but creating it as another "flavor" which can be improved over time is a good move

  • @haxwithaxe
    @haxwithaxe 10 months ago

    The C recompile thing reminds me of the 2to3 transition with C-based libraries.

  • @joshsnyder5882
    @joshsnyder5882 11 months ago +7

    Maybe you could make another video contrasting this approach with the PEP 684/554 approach of multiple interpreters.

    • @evanjeffrey9677
      @evanjeffrey9677 9 months ago

      The multiple interpreters stuff is cool, and may lead to some interesting concurrency approaches, but for now they have dropped the shared objects and channels from the proposal, so communication between interpreters is back to pickling stuff and sending it through a pipe or using manually managed shared memory, similar to the limitations of multiprocessing. I expect we will see some of those features released as libraries, and possibly included in the standard lib over the next few releases, but for now it's not really addressing the performance issues with multiprocessing.
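      i.e. roughly this shape, where every object is copied (pickled) on the way across rather than shared (a minimal stdlib multiprocessing sketch):

      from multiprocessing import Process, Pipe

      def worker(conn):
          data = conn.recv()                  # arrives as an unpickled copy
          conn.send([x * 2 for x in data])
          conn.close()

      if __name__ == "__main__":
          parent, child = Pipe()
          p = Process(target=worker, args=(child,))
          p.start()
          parent.send(list(range(5)))         # pickled and copied into the worker
          print(parent.recv())                # [0, 2, 4, 6, 8]
          p.join()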

  • @dropjesijs
    @dropjesijs 11 months ago +4

    From what I understand, Guido will not accept any "GIL-less" version of Python where single-threaded code slows down. So I would conclude that whether this gets implemented in the main version of python depends on whether this PEP can accomplish that.

    •  11 months ago +6

      "On 12 July 2018, Van Rossum announced that he would be stepping down from the position of BDFL of the Python programming language"
      That's 5 years ago.

    • @HTH565
      @HTH565 10 months ago +1

      Also the steering committee that replaced Guido has already said they will probably accept it.

    • @dropjesijs
      @dropjesijs 10 months ago

      I am late again, but... I must admit I forgot that little fact. Thanks for the reminder!

    • @banatibor83
      @banatibor83 10 months ago

      Like it would matter, python is slow AF anyway.

    • @Michallote
      @Michallote 2 months ago

      It's slow but fast eh

  • @digiryde
    @digiryde 10 months ago +1

    Ripping the guts out of any system is always a "Thar Be Dragons" process.

  • @haxwithaxe
    @haxwithaxe 10 months ago +1

    Most of the time when I'm using the threading library I'm basically doing tasks that could be handled with async. I suspect a lot of us don't really need to remove the GIL even though we use threading.
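    For example, the async shape of that kind of IO-bound work (a toy sketch with sleeps standing in for network calls):

    import asyncio

    async def fetch(name, delay):
        # stand-in for a network call; the event loop runs other tasks while we wait
        await asyncio.sleep(delay)
        return f"{name}: done"

    async def main():
        results = await asyncio.gather(fetch("a", 1.0), fetch("b", 1.0), fetch("c", 1.0))
        print(results)          # all three finish in about 1 second total

    asyncio.run(main())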

    • @evanjeffrey9677
      @evanjeffrey9677 9 months ago

      Even if you are using threads for code that mostly does IO that releases the GIL already, chances are good that your GIL-requiring python code is still a bottleneck. Same with a lot of C extensions -- while they can do some work with the GIL unlocked, every time they modify a python object they have to reacquire the GIL, which causes a lot of contention and limits effective concurrency. The way I look at it is this: a lot of people cite the 80-20 rule, or maybe the 90-10 rule. 10% of your code is 90% of your execution time. The problem python has is that executing python code is roughly 20-50x slower than native code. So if you move your 10% critical-path code to C/C++/Rust, the 90% "non-critical" code is still limiting you, albeit at a somewhat tolerable level. However, common desktop and laptop CPUs have 8 or more cores. If you now rewrite your code to use concurrency, your native code gets 5x faster, and you are completely bottlenecked on the "non-critical" python code that can still only run slowly on 1 core at a time.

  • @alexandrugheorghe5610
      @alexandrugheorghe5610 10 months ago +1

      14:10 is this what happens when using del? Is it left for garbage collection, or does it immediately invoke some procedure to take care of it?

    • @anthonywritescode
      @anthonywritescode  10 months ago +5

      del is not an imperative delete -- it essentially just decreases the refcount and sets the variable slot back to unset
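      A small illustration of that with sys.getrefcount (the counts include the temporary reference created by the call itself):

      import sys

      x = object()
      y = x                        # a second name bound to the same object
      print(sys.getrefcount(x))    # e.g. 3: x, y, and the call's temporary reference
      del y                        # just drops one reference; nothing is "deleted" yet
      print(sys.getrefcount(x))    # e.g. 2
      del x                        # last reference gone -> the object is deallocated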

    • @alexandrugheorghe5610
      @alexandrugheorghe5610 10 months ago

      @@anthonywritescode thanks!

  • @alskidan
    @alskidan 11 months ago +5

    Small things are allocated 😂 like an 80 byte integer object 😅

  • @ehza
    @ehza 11 months ago

    Thank you for the video. You're amazing!

  • @megaing1322
    @megaing1322 10 months ago +1

    There currently appears to be a large effort to implement a gcc plugin? extension? on the gcc mailing list (so by core maintainers of GCC, or at least seen by them) to help with CPython extensions, especially reference counting. I would assume they will be able to account for the nogil world.

    • @anthonywritescode
      @anthonywritescode  10 months ago

      there's been a pretty good GCC plugin for a few decades -- wouldn't surprise me if it needed a few changes to understand nogil though

    • @megaing1322
      @megaing1322 10 months ago

      @@anthonywritescode It's being completely rewritten right now, independent of nogil. It's no longer cpychecker using the "Python GCC plugin", but instead an -fanalyzer plugin fully integrated into gcc.

    • @anthonywritescode
      @anthonywritescode  10 months ago

      oh nice, that's going to be awesome!

  • @JohnWilliams-gy5yc
    @JohnWilliams-gy5yc 11 months ago

    I just strongly hope the mimalloc won't have its own Stephen Elop moment.

  • @paperC_CSGO
    @paperC_CSGO 10 months ago

    I would like more web security videos in the explains series. The only one I found was CSRF.

  • @slr150
    @slr150 10 months ago

    So are they going to have happened-before ordering or expose memory fences?

  • @sourabhk2373
    @sourabhk2373 11 months ago

    I just got the same keyboard but without the lifting kit. Do you think it's worth buying? It's quite expensive (here in India).

    • @anthonywritescode
      @anthonywritescode  11 months ago

      I can't really use it without the lift -- would recommend

  • @Carbonator5000
    @Carbonator5000 5 months ago

    I really have a dumb question about this … can’t you do something close to this by using the sys/os modules to thread using the OS?

    • @anthonywritescode
      @anthonywritescode  5 months ago

      there are real threads in python, but no there isn't some magic that just makes something like that work

  • @cmilkau
    @cmilkau 9 months ago

    Woah FINALLY. How many years has this been in the making now?

  • @ThankYouESM
    @ThankYouESM 8 months ago

    I love writing in Python, but I'm annoyed that it is too slow for many of my goals, so... I decided to learn just enough javascript (again) to convert my Python to the nearest equivalent I can... instead of using PyGame... NumPy... etc.

  • @dankprole7884
    @dankprole7884 11 months ago

    This all seems way too complicated for anything I do but interesting video nonetheless!

  • @zyxyuv1650
    @zyxyuv1650 11 months ago +4

    I've been waiting for half my lifetime for Python to finally stop being so conservative and to finally dare to progress forward with this excruciatingly slow piece of garbage where less than 1% of the language features dominate, bottleneck, and destroy 99% of the entire language's performance.

    • @alexandrugheorghe5610
      @alexandrugheorghe5610 10 months ago

      Hear, hear.

    • @CTimmerman
      @CTimmerman 10 months ago +2

      What was the bottleneck? Didn't multiprocessing allow 100% CPU use and fancy AI libs 100% GPU use as well?

  • @user-pw5do6tu7i
    @user-pw5do6tu7i 11 months ago +1

    I wonder how this will affect Flask.

  • @Chris-ty7fw
    @Chris-ty7fw 9 months ago

    So does python use a perfectly strong memory model, vs C's support for weak and strong and Java's weak memory model? Presumably a very strong model would have to be used to not break legacy code, but at the cost of performance depending on the actual physical hardware.

    • @anthonywritescode
      @anthonywritescode  9 months ago

      I don't think you can really classify it simply like that -- especially because python can call native extensions where their own memory models are at play

  • @walkdead94
    @walkdead94 11 months ago

    If there are 2 flavors it's all good! I had to go with the workaround to disable the GIL many times for specific projects; with 3.13 I know I don't need to care any more... just grab the new flavor and
    we are good to go! The other non-thread-safe projects remain with normal GIL Python!

    • @Pabna.u
      @Pabna.u 11 months ago +1

      Well, they say their eventual goal (if things work out, and maybe in 5 years) is to remove the GIL by default and get rid of any vestiges of the GIL if possible. They want to spare the community the extra work of having to maintain two flavors.

    • @alexandrugheorghe5610
      @alexandrugheorghe5610 10 months ago

      @@Pabna.u that'd be great 👍🏻

    • @Michallote
      @Michallote 2 months ago

      Hello! Hey, do you actually disable the GIL yourself for python projects? :0
      If so, how do you go about it? Do you fork python and modify the CPython implementation or smth?
      And also, have you then tried using another language that is more suited to the task? I think Javascript is the low-effort alternative to python when it comes to having non-blocking calls. Idk about multiprocessing for intensive computing tasks.

  • @DavidDellsperger
    @DavidDellsperger 11 months ago +11

    The irony of python (a snake) removing the GIL (you know, like gills on a fish) is not lost on me these days

  • @carddamom188
    @carddamom188 9 months ago

    It depends on how you use threads. If python uses something like tasks or lightweight threads, where n tasks are mapped to m threads with n > m, together with work stealing and a parking thread for blocking IO or even async IO, then it is a good change; everything else is hot garbage at this point...
    Also, reading the use cases, they are mostly related to AI, meaning that we will probably get trouble down the line when people try out the other uses for python like webdev, system scripting, and system utilities (all of the rpm- and dpkg-based systems like Ubuntu, Debian, Fedora and RedHat); that is when s**t is gonna hit the fan... Also, on the Meta developers, I would be more worried if Meta does not go the way of the metaverse and the whole thing goes bankrupt fast...

  • @Ca1vema
    @Ca1vema 10 months ago +2

    A lot of python devs do not care about thread safety, and they never will, even with the GIL removed. It's a disaster.

    • @2sourcerer
      @2sourcerer 10 months ago

      Why do they use threads in the first place when it is not really parallel? IO-bound code!

    • @Ca1vema
      @Ca1vema 10 months ago

      @wacow2 you can use threads to run CPU-bound code to introduce non-blocking handling. Nevertheless, IO-bound code doesn't escape thread synchronization; it can be a problem there as well.

    • @2sourcerer
      @2sourcerer 10 months ago

      @@Ca1vema If it’s CPU-bound, is there still an advantage with non-blocking IO? Since the introduction of asyncio, isn’t that a safer solution for IO-bound problems?

    • @Ca1vema
      @Ca1vema 10 months ago

      @wacow2 if it's a webserver which performs CPU-intensive tasks, threads will allow you to process multiple requests simultaneously at the cost of processing speed, while asyncio would block your server while handling a single request.

    • @2sourcerer
      @2sourcerer 10 months ago

      @@Ca1vema Oh! You mean even though in both cases there is only concurrency, no parallelism: you can't tie up the entire execution in a threading model because it'll still get swapped out in the middle of a CPU-intensive task. Got it.

  • @colinmaharaj
    @colinmaharaj 10 months ago +2

    Been doing parallel programming and multithreading in C++ since 1998.

  • @doresearchstopwhining
    @doresearchstopwhining 11 months ago +1

    Seems a lot like Mojo's promised features - no GIL over there...

    • @CTimmerman
      @CTimmerman 10 months ago

      Is Mojo out yet? How is it better than V, Nim, Julia, Kotlin, and Go?

    • @doresearchstopwhining
      @doresearchstopwhining 10 months ago +1

      @@CTimmerman Out in preview - you can use it in Jupyter. How is it better? Worth reading the docs. It's basically Python, but faster than C++ and with safety features like Rust. Bake that into a language that still has JIT compilation and optional loose typing - this is a big deal IMO...

    • @doresearchstopwhining
      @doresearchstopwhining 10 months ago

      @@CTimmerman Julia seems like the only one that comes close, but it's certainly not as powerful as Mojo. Basically Python with LLVM...

    • @CTimmerman
      @CTimmerman 10 months ago

      @@doresearchstopwhining C++ is fast because it allows for unsafe code that doesn't check bounds for example. So Mojo is safer and slower, or safer and faster in some cases where a JIT has more optimisation info than a static compiler, but a JIT still spends potential performance on compiling during runtime.

    • @doresearchstopwhining
      @doresearchstopwhining 10 months ago

      @@CTimmerman I think it is both faster when compiled and slower when the JIT is used. It can go back and forth, which is what I think makes it so remarkable. Still got questions though.

  • @tompov227
    @tompov227 11 months ago +1

    If python could use threads like, say, Java uses threads, I don't know that I would willingly write any code that isn't python (except JavaScript bc web).

    • @carddamom188
      @carddamom188 9 months ago

      Good God, I hope they do not go the Java way... As someone working with a Java/Spring-based webapp, that thing is a hog when there are more than 40 users online... My wish is that Project Loom finally takes off and makes everyone forget the usual 1-thread-per-user model that is so typical of Java...

  • @jmirodg7094
    @jmirodg7094 9 months ago

    I'm sure it will be a mess for a while, but we cannot continue to just use one core out of 16.

  • @user-hk3ej4hk7m
    @user-hk3ej4hk7m 11 months ago +4

    It's weird enough to be writing python code that's CPU-bound; even weirder to have it be parallelizable enough that you could get a speedup, and still not have it be worth just writing a C extension to get waaaay better performance there.

    • @potryaseniye
      @potryaseniye 9 months ago

      What about MATLAB CPU-bound code?

    • @user-hk3ej4hk7m
      @user-hk3ej4hk7m 9 months ago

      @@potryaseniye This doesn't affect numpy or scipy. Both of these modules implement the actual computation in C using SIMD where possible, with the option of using multiple cores. If you need some very specific calculation, like simulation of dynamic systems, then you can use numba to JIT-compile your code. This only affects pure python code.

    • @potryaseniye
      @potryaseniye 9 months ago

      @@user-hk3ej4hk7m I have a bit of experience developing numerical simulation software in python in academic and industrial environments. It is usually based on numpy and scipy at its core, plus efficient compiled libraries for solving linear systems, meshing, etc. Several times I encountered situations where there is pure python logic which cannot be vectorised by numpy and is too complex to be compiled by numba (it would require refactoring thousands of lines of code and making them almost unmaintainable). And due to python it becomes the main performance bottleneck. The ability to parallelise pure python would be a big win for such software. As far as I know, matlab has built-in parallelism for this, but I don’t have too much experience with it. Btw, python is chosen as the language in order to make the entry point simpler for researchers / graduate students who don’t have a lot of programming experience.

    • @evanjeffrey9677
      @evanjeffrey9677 9 months ago +1

      @@user-hk3ej4hk7m That's not really true. Scientific computing with libraries like numpy and scipy is actually a major driver for removing the GIL. Yes, C extensions can release the GIL, although many of them can't do it as much as you would like, since any time they want to access or modify python objects they have to hold the GIL. But even when you can release the GIL, as you add threads to try to get more parallelism, contention in the python code calling into the C extensions often becomes a significant bottleneck, even though it's a tiny fraction of the "work".

    • @user-hk3ej4hk7m
      @user-hk3ej4hk7m 9 months ago

      @@evanjeffrey9677 If the time it takes to run your numpy operations is significantly smaller than the time you're spending in python land, then you're probably using the libraries wrong; at that point just write everything in pure python. You should reformulate the problem so that most of the calculations and decision making are done by the extension in C land. After all, most mathematical operations are well supported by these libraries; if not, then just write your main loops with numba.

  • @spencer3752
    @spencer3752 11 months ago +3

    I'm surprised "mimalloc" is not pronounced "my malloc" as in Microsoft-Malloc.

  • @illker.
    @illker. 11 months ago +1

    No more blocking IO; I think backend applications will be faster.

    • @CTimmerman
      @CTimmerman 10 months ago +1

      Iirc, CPython already switches to other threads while waiting for IO.

  • @TheHackysack
    @TheHackysack 11 months ago

    but the GIL is our friend

  • @hxllside
    @hxllside 11 months ago

    That's crazy that it locks everything even if no variables are shared at all. I have used ThreadPool before and didn't even know this!

    • @dropjesijs
      @dropjesijs 11 months ago +5

      Use process pools. It will spin up a copy of the code and the interpreter. This has a lot of overhead, additional memory, etc., but for long-running computations it could speed up your code. I had some success with that. It really depends on your code though...
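      Roughly this shape, for example (stdlib only; each worker process pays the start-up and pickling overhead mentioned above):

      from concurrent.futures import ProcessPoolExecutor

      def simulate(seed):
          # stand-in for a long-running, CPU-bound computation
          total = 0
          for i in range(5_000_000):
              total += (i * seed) % 7
          return total

      if __name__ == "__main__":
          with ProcessPoolExecutor() as pool:
              print(list(pool.map(simulate, range(8))))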

    • @CTimmerman
      @CTimmerman 10 months ago

      @@dropjesijs Don't the performant libraries already do that?

  • @abstractrussian5562
    @abstractrussian5562 10 months ago

    Finally

  • @legion_prex3650
    @legion_prex3650 11 months ago +2

    I don't know. On the one hand, python is a slow language and number crunching like in C is not done with Python so much. And what are threads needed for, then? Concurrent IO and coroutines (asyncio) do work with python threading, and that's kinda enough for a scripting-like language. Concurrent network operations, downloads, and a threaded socketserver do work, which is totally fine for me. If I want real speed, I do Rust or C.

    • @CTimmerman
      @CTimmerman 10 months ago

      Is Tensorflow for example faster when called from C as opposed to CPython?

    • @legion_prex3650
      @legion_prex3650 10 months ago +1

      @@CTimmerman Well, you _can_ use the Tensorflow C API to call Tensorflow operations (which are in C) from C or any other language. The speed of the tensorflow operations will be the same. Tensorflow's operations (numpy and pandas alike) do not care about the GIL as they are pure C code.

    • @totopopov
      @totopopov 10 months ago

      @@legion_prex3650
      Actually, C ain't always optimal or the best performer; numpy under the hood also runs Fortran, which is faster than C in some cases. The best feature of python is essentially that it glues different pieces of code together, not necessarily C or C++.

  • @TheArmyofWin
    @TheArmyofWin 10 months ago

    If Microsoft is behind mimalloc, it's probably pronounced "Mymalloc".

  • @kezif
    @kezif 11 months ago +1

    Anyway, who needs GIL removal if doing so would change the codebase completely, make code slower, and break compatibility?

    • @Liam3851
      @Liam3851 10 months ago

      I wish the video had made the case! The main use case is this: you have multiple parallel threads which share the same data, and all threads need random access to all of the data. This breaks the multiprocessing paradigm many have historically used in Python to use all their cores (typically using multiprocessing, you'd tell each core to work on a subset of the data; if any data is shared, each CPU needs its own copy). ML and AI workloads often have this characteristic (though the case is not limited to ML and AI); the PEP was written by an engineer at Meta who works on PyTorch. In the ML/AI training case, for example, you may have a Python process running PyTorch on the CPU that orchestrates a series of preprocessing transformations on the data in RAM (say, image or text transformations to numeric encoding, then standardization and other transformations), ships batches of encoded data to the GPU for training, retrieves the result from the CPU and then sends the next batch. It can require all the CPU cores on the machine just to preprocess the data and send results to/from the GPU.
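      A sketch of the shape of that workload: every thread reads slices of one in-memory array with no per-process copies (illustrative only; under a no-GIL build the pure-Python parts could also run in parallel, not just the NumPy calls):

      import numpy as np
      from concurrent.futures import ThreadPoolExecutor

      data = np.random.rand(16_000, 64)        # one copy, shared by every thread

      def preprocess(batch_idx):
          batch = data[batch_idx * 1000:(batch_idx + 1) * 1000]
          # per-batch transformation over the shared array
          return (batch - batch.mean()) / batch.std()

      with ThreadPoolExecutor(max_workers=8) as pool:
          batches = list(pool.map(preprocess, range(16)))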

  • @drz1
    @drz1 11 months ago +1

    Would it make sense to fork core cpython into a separate language called something else? Like tpython or something like that. Then the GIL-less community can continue to develop that for the subset of developers that really need it. That would help delineate things.

    • @alexandrugheorghe5610
      @alexandrugheorghe5610 10 months ago +1

      That would be terrible, IMHO. The resources and effort of keeping them in sync... I don't even want to think about it.

    • @drz1
      @drz1 10 months ago

      @@alexandrugheorghe5610 I meant not keeping them in sync

    • @alexandrugheorghe5610
      @alexandrugheorghe5610 10 months ago

      @@drz1 TBH, I wouldn't want to keep updating my copy with the other one. Best is, IMHO, to have a major version (given the performance penalty for single-threaded applications) in order to signal that thread-safe code is needed.

    • @carddamom188
      @carddamom188 9 months ago

      Like pypy?

  • @ralvarezb78
    @ralvarezb78 1 month ago

    multiprocessing

  • @woolfel
    @woolfel 25 days ago

    The reality is Guido didn't really understand threading when he started python; that's normal for 99.9% of engineers. Until you spend a few years writing highly threaded applications, it's just hard to understand.

  • @qcktap23
    @qcktap23 9 months ago

    How do other languages implement multithreading? This seems very behind the times, too little too late.

    • @anthonywritescode
      @anthonywritescode  9 months ago

      from similar languages: js didn't (workers) for most of its existence and they're very isolated now, Ruby made a large breaking change to introduce threads at all

    • @qcktap23
      @qcktap23 9 months ago

      @@anthonywritescode so basically they need to make async/multiprocessing easier and more streamlined?

    • @anthonywritescode
      @anthonywritescode  9 months ago

      I think the answer is more "other similar languages didn't really". async also doesn't really help

    • @qcktap23
      @qcktap23 9 months ago

      @@anthonywritescode I guess I'm confused as to why they're not fully committing to multithreading to make it work, or committing to making other aspects better if they're not going to implement it.

    • @evanjeffrey9677
      @evanjeffrey9677 9 months ago

      @@qcktap23 Basically what has changed is massively multi-core consumer CPUs. While we have had multi-core CPUs for a while now, for a long time Intel made mostly 2-4 core CPUs for their mainstream products, so the potential advantage wasn't seen as worth paying overhead on single-threaded programs. Since AMD released Zen, core counts have been going up, and now Intel has both P cores and E cores for more efficient massively multi-threaded operation. So for any application that can effectively use thread-based parallelism, a 10% decrease in single-threaded performance is easily worth it for a potential 10x improvement. But people with single-threaded code would rather not pay that cost for something they aren't using.

  • @Ash-qp2yw
    @Ash-qp2yw 11 months ago +2

    Cheer 100 - Hi YouTube

  • @seasong7655
    @seasong7655 7 months ago

    Isn't pypy already without the GIL? They should have made that the new standard

    • @anthonywritescode
      @anthonywritescode  7 months ago +1

      they have a different garbage collection model -- but have a GIL

  • @douggale5962
    @douggale5962 10 months ago +1

    The GIL doesn't help enough to make python code accidentally thread-safe. It would screw up already. I remember having to make my multithreaded python just as airtight as if it were C++, as far as races are concerned.
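    For example, the usual check-then-act pattern still needs an explicit lock, GIL or not, because the check and the update span several bytecodes (a minimal sketch):

    import threading

    balance = {"amount": 100}
    lock = threading.Lock()

    def withdraw(amount):
        # without the lock, two threads can both pass the check before either subtracts
        with lock:
            if balance["amount"] >= amount:
                balance["amount"] -= amount
                return True
            return False

    threads = [threading.Thread(target=withdraw, args=(60,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(balance)   # stays non-negative because of the lock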

  • @AspartameBoy
    @AspartameBoy 6 months ago

    Just rename the old python Fish.

  • @kamurashev
    @kamurashev 11 months ago

    Tada! Make python into java.

  • @colly6022
    @colly6022 11 months ago +1

    they should just branch off and have python 4.0 be the major version that caters to multithreading and more performance-oriented features / granular control.

  • @kezif
    @kezif 11 months ago

    python 4???? LETS GOOOO

  • @Whatthetrash
    @Whatthetrash 11 months ago +1

    As someone who loves Python and is currently learning Flask to build web apps, I hope this doesn't affect things too much. Why is all this necessary -- for a speed boost? It's Python! Does it need to be faster? Don't get me wrong: faster is great but with Python faster is not the point. Aren't the most important things of the language its simplicity, clarity and dependability? I hope this doesn't mess with any of those things. I really hope all of this is 'under the hood' stuff and the experience of using and building reliable software with Python stays the same. Just my 2 cents. >_

    • @alexandrugheorghe5610
      @alexandrugheorghe5610 10 months ago

      Well, you could still run single-threaded code (and accept a penalty in performance). Though, if you want performance, you'd go multithreaded with thread-safe code.

    • @CTimmerman
      @CTimmerman 10 months ago

      @@alexandrugheorghe5610 Does that use multiple cores as well as multiprocessing?

    • @alexandrugheorghe5610
      @alexandrugheorghe5610 10 months ago

      @@CTimmerman for that you'd have to use multiprocessing (as mentioned in the video, you can still do it now, but keep in mind there will be one interpreter per process and each one will have the GIL; obviously when [if] it goes away, there can be real multiprocessing with multithreading available)

    • @CTimmerman
      @CTimmerman 10 months ago

      @@alexandrugheorghe5610 Isn't multithreading sequential per core?

    • @alexandrugheorghe5610
      @alexandrugheorghe5610 10 months ago

      @@CTimmerman you'd get 1 process per core that can do multithreading, no?

  • @JasFox420
    @JasFox420 10 months ago

    Forking Python into more flavors is stupid. This is bad.

    • @anthonywritescode
      @anthonywritescode  10 months ago

      as mentioned in the video -- this is not new to python. (and it currently already has 2 flavors! in the past there were many more)

    • @JasFox420
      @JasFox420 10 months ago

      @@anthonywritescode well aware - I'm not looking forward to going backwards.

    • @JasFox420
      @JasFox420 10 months ago

      I would accept this better as the basis of a dichotomy between 3 and 4, more like 2 and 3, and make 4 the only truly multi-threaded one.

    • @CTimmerman
      @CTimmerman 10 months ago +1

      @@anthonywritescode 2 flavors? So Jython, IronPython, Cython, Stackless, Numba, Nuitka, etc. aren't?

    • @anthonywritescode
      @anthonywritescode  10 months ago

      @CTimmerman -- this video is specifically about cpython and its flavors

  • @toddnedd2138
    @toddnedd2138 9 months ago

    Finally! Coming from C/C# to Python, I've always found it challenging to write a multiprocess application to utilize the full CPU power. No offense intended, but now all Python enthusiasts will need to write clean code. This will certainly separate the wheat from the chaff.

  • @cmilkau
    @cmilkau 9 months ago

    I really wish people on YouTube would more explicitly distinguish between Python the language and CPython the interpreter. It's not the only one!

    • @anthonywritescode
      @anthonywritescode  9 months ago

      if you actually watch the video I do clarify (I'm part of the pypy team sort of)

  • @christopherprobst-ranly6357
    @christopherprobst-ranly6357 10 months ago +3

    With this attitude, of course it won't be successful. Python has a shitty slow runtime just because no one tackled this issue. This could have been solved early, like all other runtimes did. But Python just relaxed and waited 15 years, now the very last one to the party. It has to happen; otherwise, Python will die out as soon as a new glue language manifests itself. Python is used mostly as a glue language because it is simply not capable of anything else. You HAVE to bind to C for even the most simplistic tasks because Python is interpreted and GILed. And about your argument: I think if you don't share C dep instances across threads, there should be no data races. If multiple interpreters are used, these instances are all fully isolated. You would still get a speedup without the risk, and the good C lib developers can finally make their code safe. Let's forget about old and shitty code; I am fine with that running in GIL mode. I think this PEP is long overdue and the only rescue for Python to stay relevant in future years, period.

    • @anthonywritescode
      @anthonywritescode  10 months ago +1

      I guess you missed the part where this makes python slower but sure go off

    • @christopherprobst-ranly6357
      @christopherprobst-ranly6357 10 months ago +2

      @@anthonywritescode Python is already pretty slow; it can't hurt to be a bit slower if you can win 4-16x by properly using multicore. Pretty compelling tradeoff.

  • @idobooks909
    @idobooks909 6 months ago

    Oh just let GPT-4 Turbo handle all the code rewrites.. right?