Multiprocessing in Python: Pool

  • Published: 18 Oct 2024
  • This video is sponsored by Oxylabs. Oxylabs provides market-leading web scraping solutions for large-scale public data gathering. You can receive data in JSON or CSV format and pay only per successful request. At the moment, Oxylabs offers a free trial.
    oxylabs.io/?ut...
    In this video, we will be continuing our treatment of the multiprocessing module in Python. Specifically, we will be taking a look at the "Pool" class, and how we can go about using the Pool class to instantiate tasks that are run across multiple processors on our machine.
    According to the official documentation (docs.python.or...
    "multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows."
    Software from this video:
    github.com/vpr...
    For more videos on multiprocessing:
    bit.ly/lp_multi...
    Do you like the development environment I'm using in this video? It's a customized version of vim that's enhanced for Python development. If you want to see how I set up my vim, I have a series on this here:
    bit.ly/lp_vim
    If you've found this video helpful and want to stay up-to-date with the latest videos posted on this channel, please subscribe:
    bit.ly/lp_subsc...

Comments • 114

  • @davidmkahler
    @davidmkahler 6 months ago +1

    I have looked at A LOT of python parallel videos and this is the first one that worked! I also added a print line to the serial function to confirm the output; they were consistent. Furthermore, it worked well and can be adapted easily for more advanced applications. Thank you!

    • @LucidProgramming
      @LucidProgramming 6 months ago

      Great to hear, David. Thank you for watching!

  • @thet00nl1nk3
    @thet00nl1nk3 5 years ago +8

    Finally a clear tutorial on this! Was finally able to get it to work thanks to you.

    • @LucidProgramming
      @LucidProgramming 5 years ago +1

      Thank you! If you like my content, I've been working on some projects during the past couple of months. If you would like to stay up-to-date, please consider subscribing to my mail list. Also, if you haven't already, please consider subscribing!
      I really appreciate the support, and if you want to support the channel, I do have a PayPal link
      paypal.me/VincentRusso1
      for donations that go directly to the creation of content on this channel.
      I hope that the content I provide there will enhance the videos on my RUclips page.
      bit.ly/lp_email

  • @ninjahkz4078
    @ninjahkz4078 2 years ago +1

    thanks for the video bro, I was looking for several and several and could not understand as clearly as it was in this video!

    • @LucidProgramming
      @LucidProgramming 2 years ago

      Cheers! If you enjoyed and benefited from my content, please consider liking the video and subscribing to the channel for more content like this. If you would like to support the content creation on this channel please consider unblocking ads when watching my videos as this is how I support my time to make content. I hope to be putting out more similar videos soon!

  • @briananderson8141
    @briananderson8141 11 months ago +2

    This is a phenomenally well made video, well done sir on your presentation and clarity.

    • @LucidProgramming
      @LucidProgramming 11 months ago

      Thank you kindly! I appreciate your support!

  • @cavidanbagiri1884
    @cavidanbagiri1884 5 years ago +5

    you are best teacher . Greetings from Azerbaijan

    • @LucidProgramming
      @LucidProgramming 5 years ago +2

      Thank you! If you like my content, I've been working on some projects during the past couple of months. If you would like to stay up-to-date, please consider subscribing to my mail list. Also, if you haven't already, please consider subscribing! I really appreciate the support. I hope that the content I provide there will enhance the videos on my RUclips page.
      bit.ly/lp_email

  • @zapata22
    @zapata22 1 month ago

    you are a legend! thanks for this tutorial.

  • @franciscon1048
    @franciscon1048 5 years ago +4

    Nice work!! Thanks for this series of videos. Greetings from Chile

    • @LucidProgramming
      @LucidProgramming 5 years ago

      Greetings from Canada :). I am happy to have produced content that was found to be valuable by people like yourself.
      If you like my content, I've been working on some projects during the past couple of months. If you would like to stay up-to-date, please consider subscribing to my mail list. Also, if you haven't already, please consider subscribing! I really appreciate the support. I hope that the content I provide there will enhance the videos on my RUclips page.
      bit.ly/lp_email

  • @buffaloofm7119
    @buffaloofm7119 2 years ago +1

    I had a problem with real-time OCR and QR detection. With a single thread, saving the images and running the OCR and QR processing took long enough to freeze the screen for about 0.3 seconds. So I used multiprocessing and created two processes to capture images and process QR/OCR, plus one shared list for writing and reading NumPy arrays (ROI). It works better. Thank you very much.

    • @LucidProgramming
      @LucidProgramming 2 years ago +1

      Cheers! If you enjoyed and benefited from my content, please consider liking the video and subscribing to the channel for more content like this. If you would like to support the content creation on this channel please consider unblocking ads when watching my videos as this is how I support my time to make content. I hope to be putting out more similar videos soon!

  • @saikun0293
    @saikun0293 2 years ago +1

    Awesome explanation!

    • @LucidProgramming
      @LucidProgramming 2 years ago

      Cheers! If you enjoyed and benefited from my content, please consider liking the video and subscribing to the channel for more content like this. If you would like to support the content creation on this channel please consider unblocking ads when watching my videos as this is how I support my time to make content. I hope to be putting out more similar videos soon!

  • @kenmurphy4259
    @kenmurphy4259 4 years ago +2

    Fantastic Python class!

    • @LucidProgramming
      @LucidProgramming 4 years ago

      Thank you! If you like my content, I've been working on some projects during the past couple of months. If you would like to stay up-to-date, please consider subscribing to my mail list. Also, if you haven't already, please consider subscribing!
      I really appreciate the support, and if you want to support the channel, I do have a PayPal link
      paypal.me/VincentRusso1
      for donations that go directly to the creation of content on this channel.
      I hope that the content I provide there will enhance the videos on my RUclips page.
      bit.ly/lp_email

  • @ashkanajrian830
    @ashkanajrian830 4 years ago

    Great example! I have faced almost the same situation, where I expected multiprocessing to speed up my execution time but got a longer run time instead! Thanks.

  • @schogaia
    @schogaia 5 years ago +1

    Excellent video, greetings from Germany

    • @LucidProgramming
      @LucidProgramming 5 years ago +1

      Thank you! If you like my content, I've been working on some projects during the past couple of months. If you would like to stay up-to-date, please consider subscribing to my mail list. Also, if you haven't already, please consider subscribing!
      I really appreciate the support, and if you want to support the channel, I do have a PayPal link (www.paypal.me/VincentRusso1) for donations that go directly to the creation of content on this channel.
      I hope that the content I provide there will enhance the videos on my RUclips page.
      bit.ly/lp_email

    • @schogaia
      @schogaia 5 years ago +1

      @@LucidProgramming all set: subscribed to your channel, donated and subscribed to your mailing list

    • @schogaia
      @schogaia 5 years ago

      Oh, what I forgot to mention: if you ever run out of ideas for topics, I would be really interested in a series about object-oriented programming. I get the point of what OOP can be used for, but I cannot think of any use case where I (as someone who has been into Python for less than a year) would ever want to use OOP.

    • @LucidProgramming
      @LucidProgramming 5 years ago +1

      @@schogaia Wow, thank you so much! I sincerely appreciate all forms of your support! That really means the world to me to have support from great viewers like yourself and encourages me to continue (hopefully :P) helping others and making content. Thank you again for your patronage!

    • @LucidProgramming
      @LucidProgramming 5 years ago +1

      @@schogaia Noted! I think this is a great idea for a series of videos. Once I get my recording setup back in proper order, I think I will add this one to the list. Again, thank you for the suggestion!

  • @VoiceOfAsh
    @VoiceOfAsh 1 year ago

    Awesome video as usual!

  • @rafaeltsuhafachini3763
    @rafaeltsuhafachini3763 4 years ago +1

    great example and explanation bro!

    • @LucidProgramming
      @LucidProgramming 4 years ago +1

      Thank you! If you like my content, I've been working on some projects during the past couple of months. If you would like to stay up-to-date, please consider subscribing to my mail list. Also, if you haven't already, please consider subscribing!
      I really appreciate the support, and if you want to support the channel, I do have a PayPal link
      paypal.me/VincentRusso1
      for donations that go directly to the creation of content on this channel.
      I hope that the content I provide there will enhance the videos on my RUclips page.
      bit.ly/lp_email

  • @Optisoins
    @Optisoins 5 years ago +1

    Once again ! Very interesting video ! Thank you very much

    • @LucidProgramming
      @LucidProgramming 5 years ago

      Really appreciate your comments! I'm glad to hear you've enjoyed this series!

  • @chmaguire79
    @chmaguire79 4 years ago +1

    great example - thanks!

    • @LucidProgramming
      @LucidProgramming 4 years ago

      Thank you! If you like my content, I've been working on some projects during the past couple of months. If you would like to stay up-to-date, please consider subscribing to my mail list. Also, if you haven't already, please consider subscribing!
      I really appreciate the support, and if you want to support the channel, I do have a PayPal link
      paypal.me/VincentRusso1
      for donations that go directly to the creation of content on this channel.
      I hope that the content I provide there will enhance the videos on my RUclips page.
      bit.ly/lp_email

  • @RohunTripati
    @RohunTripati 5 years ago +1

    Nice work!!

  • @chuanjiang6931
    @chuanjiang6931 4 months ago

    Under what condition do you recommend using Pool as opposed to Process?

  • @novianindy887
    @novianindy887 2 years ago +1

    what about compared to threading?
    which is faster and whats the difference

  • @talbarak8861
    @talbarak8861 4 years ago

    It is worth mentioning that it isn't possible to call p.join() before calling p.close() (or p.terminate()). At least it isn't possible according to the official Python documentation.

  • @summerxia7474
    @summerxia7474 2 years ago

    Thank you so much for your demonstration! Very clear and helpful. Can I ask why we need the line "if __name__ == '__main__'"? Thank you!

    • @bigsmoke6414
      @bigsmoke6414 2 years ago

      Because each process basically imports the file that contains this function and then executes the function. However, importing a file also runs everything at its top level (everything that is not inside a function). So without that "if", each process would start more processes in turn, and processes would be spawned endlessly.
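
The explanation above can be illustrated with a minimal sketch. The `sum_square` function here is an assumption: it is written to reproduce the output quoted elsewhere in this thread, not copied from the video's source.

```python
from multiprocessing import Pool


def sum_square(number):
    # Assumed to mirror the video's function: sum of squares of 0 .. number-1.
    return sum(i * i for i in range(number))


def main():
    # Only the parent process is meant to reach this code.
    with Pool() as pool:
        print(pool.map(sum_square, range(5)))


if __name__ == "__main__":
    # When a worker process re-imports this file (as happens with the
    # "spawn" start method used on Windows), __name__ is not "__main__",
    # so the worker skips this branch and does not create its own pool.
    main()
```

Keeping the pool creation inside `main()` also means that importing this file from another module has no side effects.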

  • @geperudeta1995
    @geperudeta1995 4 years ago +1

    thank you!

    • @LucidProgramming
      @LucidProgramming 4 years ago

      Cheers! If you enjoyed and benefited from my content, please consider liking the video and subscribing to the channel for more content like this. If you would like to support the content creation on this channel please consider unblocking ads when watching my videos as this is how I support my time to make content. I hope to be putting out more similar videos soon!

  • @ighsight
    @ighsight 2 years ago

    Excellent. I'm trying to implement this along with a timeout that will kill processes that are running longer than a set time. If anyone knows of a video that shows how to do that please pass it on.
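
One possible approach to the timeout question above, as a rough sketch rather than a definitive recipe: submit the task with `apply_async`, wait on the result with a timeout, and terminate the pool if the wait expires. The names `slow_task` and `run_with_timeout` are hypothetical helpers introduced only for this illustration.

```python
import multiprocessing
import time


def slow_task(seconds):
    # Hypothetical stand-in for a job that may run too long.
    time.sleep(seconds)
    return seconds


def run_with_timeout(func, args, timeout):
    # A single-worker pool per call, so terminate() only affects this task.
    pool = multiprocessing.Pool(processes=1)
    try:
        return pool.apply_async(func, args).get(timeout=timeout)
    except multiprocessing.TimeoutError:
        return None  # the task did not finish within the time limit
    finally:
        pool.terminate()  # stops the worker even if it is still running
        pool.join()


if __name__ == "__main__":
    print(run_with_timeout(slow_task, (0.1,), timeout=5))  # finishes in time
    print(run_with_timeout(slow_task, (10,), timeout=1))   # gets terminated
```

Creating a pool per task is expensive, so this design only makes sense for a small number of long-running jobs.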

  • @박동연-t9w
    @박동연-t9w 5 years ago +2

    Thanks for the great contents. I'm really learning a lot from these videos. I have a question, is there a smart way to give pool class multiple arguments? For example, in this video, function 'sum_square' takes 1 integer as an argument and to execute parallel computation we make a list of integers[numbers] and use 'p.map(sum_square, numbers)' . What I see is that the map function takes the function to parallelize and then a list of arguments. But for a function that takes multiple arguments other than only one, lets say for 2 integers, how do you design the pool multiprocessing?

    • @LucidProgramming
      @LucidProgramming 5 years ago

      Thank you! If you like my content, I've been working on some projects during the past couple of months. If you would like to stay up-to-date, please consider subscribing to my mail list. Also, if you haven't already, please consider subscribing!
      I really appreciate the support, and if you want to support the channel, I do have a PayPal link (www.paypal.me/VincentRusso1) for donations that go directly to the creation of content on this channel.
      I hope that the content I provide there will enhance the videos on my RUclips page.
      bit.ly/lp_email
      And w.r.t. your question, checkout the "functools" module. Cheers!

    • @JWNam98
      @JWNam98 5 years ago

      I'm having the same issue !

    • @JWNam98
      @JWNam98 5 years ago

      This may not be a satisfying solution, hope it helps
      python.omics.wiki/multiprocessing_map/multiprocessing_partial_function_multiple_arguments

    • @LucidProgramming
      @LucidProgramming 5 years ago

      @@JWNam98 Did you look into using "functools"? That should solve your problem.

    • @miqoo1996
      @miqoo1996 1 year ago

      This is the way on how you can do that:
      from multiprocessing import Pool

      def multiply(a, b):
          return a * b

      if __name__ == '__main__':
          args_list = [(2, 3), (4, 5), (6, 7)]  # a list of argument tuples
          with Pool(processes=2) as pool:
              results = pool.starmap(multiply, args_list)
          print(results)
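
The "functools" suggestion made earlier in this thread can be sketched with `functools.partial`, which fits the case where one argument stays fixed and only the other varies. The `power` function is a hypothetical example, not something from the video.

```python
from functools import partial
from multiprocessing import Pool


def power(base, exponent):
    # Hypothetical two-argument function used only for illustration.
    return base ** exponent


if __name__ == "__main__":
    # Fix exponent=2 so that map() only has to supply `base`.
    square = partial(power, exponent=2)
    with Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3, 4]))
```

When both arguments vary per call, `pool.starmap` with a list of tuples is usually the more direct choice.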

  • @mukund4life
    @mukund4life 4 years ago +2

    well explained :)

  • @KARAB1NAS
    @KARAB1NAS 5 years ago

    cool video - what is the difference between using 'Process' and 'Pool'?

    • @LucidProgramming
      @LucidProgramming 5 years ago

      Thanks for watching! W.r.t. your question, I honestly think Google has some great answers on these differences. Far better than anything I can convey in a RUclips comment. Thanks again for watching, and have a nice day!
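
As a rough illustration of the Process versus Pool question, the sketch below runs the same kind of work both ways. The function names are hypothetical, and this is only one way to frame the difference.

```python
from multiprocessing import Pool, Process


def report_square(number):
    # A Process target's return value is discarded, so this one prints.
    print(number, number * number)


def square(number):
    return number * number


if __name__ == "__main__":
    # Process: one process per task, started and joined by hand.
    workers = [Process(target=report_square, args=(n,)) for n in range(3)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    # Pool: a fixed number of workers share a queue of tasks and hand
    # the return values back to the parent process.
    with Pool(processes=2) as pool:
        print(pool.map(square, range(3)))
```

Loosely speaking, Process suits a handful of distinct long-lived jobs, while Pool suits many similar tasks whose results need to be collected.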

  • @phillipotey9736
    @phillipotey9736 3 years ago

    Nice. One comment: you can run twice as many processes as you have cores, so with 16 cores you can run 32 processes.

  • @LinhHoang-zi9mt
    @LinhHoang-zi9mt 5 years ago +1

    I like your vim setup. How can I get the style/customization like yours? Thanks.

    • @LucidProgramming
      @LucidProgramming 5 years ago

      Thanks! I have a series on this that can be found here: bit.ly/lp_vim. Cheers!

    • @LinhHoang-zi9mt
      @LinhHoang-zi9mt 5 years ago

      ​@@LucidProgramming Just to let you know that your video is super useful. I dig further into the documentation and figure out how to send multiple arguments. Mucho gracias!! stackoverflow.com/questions/5442910/python-multiprocessing-pool-map-for-multiple-arguments

  • @maurcd
    @maurcd 3 years ago

    Hi! Is there anyway to use a multiprocessing within a function?
    Specifically I'd like to know if for example it is possible to define a function "sum_square_with_mp(numbers)", where if a run the code, then I would be able to type sum_square_with_mp([1,3,5]) and get 1, 14 and 55.

    • @LucidProgramming
      @LucidProgramming 3 years ago

      Yeah, I would say check the docs for multiprocessing.
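
A minimal sketch of what the question above asks for. Note one assumption: `sum_square` here sums the squares of 1 through n inclusive, which is what reproduces the values 1, 14 and 55 from the question; the function in the video may use a different range.

```python
from multiprocessing import Pool


def sum_square(number):
    # Assumption: inclusive upper bound, chosen to match the question.
    return sum(i * i for i in range(1, number + 1))


def sum_square_with_mp(numbers):
    # The pool is created, used and cleaned up inside the function.
    with Pool() as pool:
        return pool.map(sum_square, numbers)


if __name__ == "__main__":
    print(sum_square_with_mp([1, 3, 5]))
```

The call to `sum_square_with_mp` still has to happen under the `if __name__ == "__main__":` guard on platforms that start workers by re-importing the main module.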

  • @HeadphoneYT
    @HeadphoneYT 4 years ago

    How can I explicitly limit the number of processes? For example, I want to run only 3 processes at a time.
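
For the question above, the `processes` argument of `Pool` sets the number of worker processes. A short sketch, with `sum_square` assumed to resemble the video's function:

```python
from multiprocessing import Pool


def sum_square(number):
    # Assumed to mirror the video's function: sum of squares of 0 .. number-1.
    return sum(i * i for i in range(number))


if __name__ == "__main__":
    # processes=3 caps the pool at three workers; the remaining inputs
    # wait in a queue until one of the workers becomes free.
    with Pool(processes=3) as pool:
        results = pool.map(sum_square, range(10))
    print(results)
```

The same mechanism applies when there are more inputs than workers: the inputs are handed out in batches rather than one process being created per input.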

  • @paulosergioschlogl9550
    @paulosergioschlogl9550 4 years ago

    How can I use Pool to count words in a huge file (genome = DNA) to speed up my results?
    For example, if I look for palindrome words in a text and need to build a dict (word: [list of positions]), it takes 2 minutes, and I have 198000 texts to search. 8)
    I was passing arguments such as alphabet = 'acgt' and k = an integer giving the length of the words to look for.
    I was using partial to pass the text to the function and then pool.apply_async(func, args), but the performance is still not good at all. Do you have any suggestions?

    • @LucidProgramming
      @LucidProgramming 4 years ago +1

      A lot of that would depend on some of the specifics of the problem. It's hard for me to offer anything that doesn't come off as super general. I have done work with clients who have had large datasets for biostatistics, etc. If you think you could benefit from some outside help, I do offer such services if you want to PM me. Hope that helps, cheers!

  • @socialrupt8040
    @socialrupt8040 3 years ago

    The code at 5:00 gives me this output:
    [0, 0, 1, 5, 14]
    Traceback (most recent call last):
      File "main.py", line 17, in <module>
        p.join()
      File "/usr/lib/python3.8/multiprocessing/pool.py", line 659, in join
        raise ValueError("Pool is still running")
    ValueError: Pool is still running
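
The error above typically means `join()` was called on a pool that had not been closed. A sketch of the expected ordering, with `sum_square` assumed to resemble the video's function:

```python
from multiprocessing import Pool


def sum_square(number):
    # Assumed to mirror the video's function: sum of squares of 0 .. number-1.
    return sum(i * i for i in range(number))


if __name__ == "__main__":
    p = Pool()
    result = p.map(sum_square, range(5))
    p.close()  # tell the pool that no more tasks will be submitted
    p.join()   # only now is it valid to wait for the workers to exit
    print(result)
```

Calling `terminate()` instead of `close()` also makes `join()` valid, but it stops the workers without finishing outstanding tasks.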

  • @utayasurian419
    @utayasurian419 3 years ago

    How to solve broken process pool on multiprocessing?

  • @sdda1592
    @sdda1592 5 years ago +1

    what is the terminal that you are using?

    • @LucidProgramming
      @LucidProgramming 5 years ago

      You mean what editor am I using? It's vim. I have a whole series on it you can check out here: bit.ly/lp_vim

  • @gametester905
    @gametester905 4 years ago

    What should I do if my function needs two arguments? What would be the correct syntax? Should I create a list of two arguments on each element? result = p.map(function, (arg1,arg2)) ??

    • @LucidProgramming
      @LucidProgramming 4 years ago

      Did you try this?

    • @gametester905
      @gametester905 4 years ago

      @@LucidProgramming Yes. Let's suppose that I want to multiply two numbers. When I use this for two sets (four numbers), it always asks for the second argument. I don't know what I can do to solve it. I tried many ways to write the arguments.

    • @LucidProgramming
      @LucidProgramming 4 years ago

      @@gametester905 Just put a comma after the last argument.

  • @saber291996
    @saber291996 4 years ago

    Is it possible to map the function on gpu instead of cpu? Thank you very much for the tutorial.

    • @LucidProgramming
      @LucidProgramming 4 years ago +1

      Hi Nicky. The multiprocessing module has to do with the cores on your machine, not on your GPU, so this would not work as a one-to-one mapping.

    • @saber291996
      @saber291996 4 years ago +1

      @@LucidProgramming Get it. Many thanks.

    • @LucidProgramming
      @LucidProgramming 4 years ago +1

      @@saber291996 Great. Thanks again for the comment! :)

  • @manjum1483
    @manjum1483 4 years ago

    I would like to understand: if my VM has only 10 cores and I allocate 200 processes, what will happen? And how does p.map(fun, [1....300]) work if there are more processes and more arguments than cores? Thank you.

  • @ChrisUhlik
    @ChrisUhlik 4 years ago

    os.cpu_count() returns the number of virtual cores. On a 4-core, hyper-threading machine, it returns 8. However, Python cannot efficiently run two interpreters on a single hyper-threaded core, as the processes have separate memory spaces but the hyper-threaded core halves share a single memory space. To really operate at maximum efficiency, you should separate your code into a pool of 4 processes, each of which consists of 2 threads running on a single interpreter. Thus the thread pairs could run concurrently, taking advantage of overlapping IO and computation for example, while the 4 cores could each be doing some computationally intensive operation. Too much detail for this level of tutorial, but you might have significant savings for large-memory processes by avoiding the default argument to Pool() and instead using Pool(os.cpu_count() // 2) or Pool(psutil.cpu_count(logical=False)).

    • @ChrisUhlik
      @ChrisUhlik 4 years ago

      Quick example on my machine: numbers=range(10000) Pool() generated 8 processes and the times were 1.016 and 3.289 seconds so multiprocessing was a bit less than 4x faster. And if I run again using Pool(4) it generates 4 processes and the times were 1.013 and 3.220 seconds, so actually a bit faster with less context switching even though I was using half as many processes.
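
The suggestion in this thread can be sketched with the standard library alone. The helper name `physical_core_guess` is hypothetical, and the halving rests on the assumption of two logical CPUs per physical core, which does not hold on every machine.

```python
import os
from multiprocessing import Pool


def physical_core_guess():
    # Assumption: hyper-threading with two logical CPUs per physical core.
    logical = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, logical // 2)    # integer division; Pool needs an int


def sum_square(number):
    # Assumed to mirror the video's function: sum of squares of 0 .. number-1.
    return sum(i * i for i in range(number))


if __name__ == "__main__":
    with Pool(processes=physical_core_guess()) as pool:
        results = pool.map(sum_square, range(1000))
    print(len(results))
```

Whether fewer workers is actually faster depends on the workload, so it is worth timing both settings as the comment above did.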

  • @marwanfouad5781
    @marwanfouad5781 5 years ago +1

    Thanks a lot for this rich content, I really learned a lot from it. I have a question: What is the difference between Pooling and using the classical Process modules as in the previous videos?

    • @LucidProgramming
      @LucidProgramming 5 years ago +1

      Thank you! If you like my content, I've been working on some projects during the past couple of months. If you would like to stay up-to-date, please consider subscribing to my mail list. Also, if you haven't already, please consider subscribing! I really appreciate the support. I hope that the content I provide there will enhance the videos on my RUclips page.
      bit.ly/lp_email
      Regarding your question, I think this post does a nice coverage of the two comparatively. Thanks again for watching!
      www.ellicium.com/python-multiprocessing-pool-process/

    • @marwanfouad5781
      @marwanfouad5781 5 years ago +1

      @@LucidProgramming Thank you very much for your reply. I already subscribed to your channel wishing you all the best.

    • @LucidProgramming
      @LucidProgramming 5 years ago

      @@marwanfouad5781 Thank you kindly, Marwan! I appreciate your support :). Wishing you the best as well! Cheers!

  • @amitbeniwal1
    @amitbeniwal1 4 years ago

    Hi, I have code which interacts with network devices and takes some commands output from the devices and processes that output to take useful data and writes to a csv file. Now in the code i want to introduce multiprocessing but the problem is that the main() function calls the other functions and these functions also call other local functions as per requirement. Can you please help me out here as I am really confused on how to write the same code with multiprocessing because this operation is done on more than 100 devices and consumes about an hour to complete.

    • @LucidProgramming
      @LucidProgramming 4 years ago

      It's hard for me to help as the details you've provided are a bit sparse. I think there would be a lot of unique things specific to your situation that I would not be aware of.

  • @talbarak8861
    @talbarak8861 4 years ago

    How the Pool object differs from ProcessPoolExecutor?

    • @LucidProgramming
      @LucidProgramming 4 years ago

      You can try Googling for this. I think the answer you would find would be better than something I can provide here in a comment.
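
For a quick side-by-side on the question above, the sketch below runs the same mapping through `multiprocessing.Pool` and `concurrent.futures.ProcessPoolExecutor`. It only shows that the basic `map` usage is similar; it does not cover every difference between the two. `sum_square` is assumed to resemble the video's function.

```python
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import Pool


def sum_square(number):
    # Assumed to mirror the video's function: sum of squares of 0 .. number-1.
    return sum(i * i for i in range(number))


if __name__ == "__main__":
    numbers = range(5)

    # multiprocessing.Pool: map() blocks and returns a list.
    with Pool(processes=2) as pool:
        pool_result = pool.map(sum_square, numbers)

    # ProcessPoolExecutor: map() returns an iterator over the results.
    with ProcessPoolExecutor(max_workers=2) as executor:
        executor_result = list(executor.map(sum_square, numbers))

    print(pool_result == executor_result)
```

`ProcessPoolExecutor` also offers `submit()`, which returns `Future` objects, while `Pool` has its own extras such as `starmap` and `imap_unordered`.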

  • @dwiagungahmad2446
    @dwiagungahmad2446 3 years ago

    Does this work for Python 3 or not?

  • @aanchalagarwal6886
    @aanchalagarwal6886 4 years ago

    While using Multiprocessing Pool, I am getting a permission denied error. Can you help me with that

    • @LucidProgramming
      @LucidProgramming 4 years ago

      sudo?

    • @aanchalagarwal6886
      @aanchalagarwal6886 4 years ago

      @@LucidProgramming I am working on a Windows Machine, and I am already using a Virtual Environment

    • @LucidProgramming
      @LucidProgramming 4 years ago

      @@aanchalagarwal6886 Google?

    • @talbarak8861
      @talbarak8861 4 years ago +1

      @@aanchalagarwal6886 Not sure if this is related, but on Windows you must put your process creation code inside: if __name__ == "__main__":

  • @sanjaykrish8719
    @sanjaykrish8719 4 years ago

    multiprocessing.cpu_count() finds the number of cores

  • @NathanBinks
    @NathanBinks 5 years ago

    I was using this code to understand what it does, and if it's converted into an exe on Windows, it'll eat all of your RAM in 1-5 seconds and create over 1000 processes, ending in a system crash. (I saw the code create 5000+ at one time.)

    • @LucidProgramming
      @LucidProgramming 5 years ago

      Yep, it'll consume all your resources. Thought I mentioned a caution against that, but I didn't, I apologize!
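
One possible cause of the process explosion described above, assuming the executable was built with a freezing tool (the comment does not say which), is a missing `multiprocessing.freeze_support()` call. A minimal sketch of where that call goes, with `sum_square` assumed to resemble the video's function:

```python
import multiprocessing


def sum_square(number):
    # Assumed to mirror the video's function: sum of squares of 0 .. number-1.
    return sum(i * i for i in range(number))


if __name__ == "__main__":
    # Intended for scripts frozen into a Windows executable; it should be
    # the first statement under the guard. It has no effect otherwise.
    multiprocessing.freeze_support()
    with multiprocessing.Pool(processes=2) as pool:
        print(pool.map(sum_square, range(5)))
```

This is a guess at the cause, not a confirmed diagnosis; a missing `if __name__ == "__main__":` guard can lead to similar runaway process creation.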

  • @francislapena8818
    @francislapena8818 5 years ago

    import multiprocessing as mp
    mp.cpu_count()

    • @LucidProgramming
      @LucidProgramming 5 years ago

      What is this comment in reference to? Am I missing something?

  • @Didanihaaaa
    @Didanihaaaa 5 years ago

    Thanks for your video. I do have a question about multiprocessing. Here I asked it on StackOverflow.
    stackoverflow.com/questions/57529308/solving-a-problem-utilizing-multiprocessing-and-value-in-python
    I appreciate any help.
    After solving the same problem with the new approach shown in your video, I still get no benefit in terms of speed.