Parallelize Python Tasks with Joblib

  • Published: 14 Nov 2024

Comments • 39

  • @tomjohnson8986
    @tomjohnson8986 1 year ago +5

    Straight to the point and clear. Well done. No fluff.

  • @thomasgoodwin2648
    @thomasgoodwin2648 2 years ago +8

    Be careful with your benchmarking. The first run is single-threaded, and every run after that is multi-threaded. However, the first run actually has to go out to the internet and download the images. The later runs are a bit of a cheat, since from then on the images are likely served from cache rather than re-downloaded.
    I'm just trying to make the point that when actually benchmarking, you need to average many runs, because runs under initial conditions may not be indicative of 'normal' run times.
    It also illustrates a bit of the difficulty in benchmarking network programs.
    Many thanks as always. As usual, you take 'That's way above my brain limits' and transform it into 'Huh, so it's really just that easy.'
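
A minimal sketch of the averaging idea in this comment, not the video's code: run the function once untimed so caches are warm, then time several runs and report the mean and spread. The names `benchmark` and `fake_download` are hypothetical, and `time.sleep` merely stands in for the real download.

```python
import statistics
import time


def benchmark(func, *, warmup=1, runs=5):
    """Return (mean, stdev) of wall-clock seconds over `runs` timed calls."""
    # Untimed warm-up calls let any caches fill before measuring.
    for _ in range(warmup):
        func()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    # stdev needs at least two samples, so keep runs >= 2.
    return statistics.mean(timings), statistics.stdev(timings)


def fake_download():
    # Hypothetical stand-in for the real network call.
    time.sleep(0.01)


mean, spread = benchmark(fake_download, warmup=1, runs=5)
print(f"mean {mean:.4f}s, stdev {spread:.4f}s")
```

Applying the same `benchmark` call to both the sequential and the parallel version would put the two on equal footing with respect to caching.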

  • @gustavobertocco554
    @gustavobertocco554 2 years ago +2

    Awesome, this is exactly the type of pipeline I was looking for in order to integrate joblib :)

  • @rahulkumarsingh1716
    @rahulkumarsingh1716 1 year ago

    With every video, I learn something new! Good job.

  • @viniciusfriasaleite8016
    @viniciusfriasaleite8016 2 years ago

    I want to apply parallel computing to the minimax algorithm I implemented for a Connect 4 game. It looks like a really good improvement. Thanks for the content.

  • @bw2
    @bw2 2 years ago +2

    Well-timed video. The multiprocessing module was giving me a hard time: I had to selectively scrape files (around 4K files per execution). I think this will be fantastic with HTTPX (a drop-in replacement for the requests module, but with async support).

  • @JohnWalz97
    @JohnWalz97 2 years ago +5

    You should always use 'time.perf_counter' instead of 'time.time()' when you're trying to benchmark code. It's way more accurate.

    • @user-wr4yl7tx3w
      @user-wr4yl7tx3w 2 years ago +1

      I use %timeit. Is it better as well?

    • @JohnWalz97
      @JohnWalz97 2 years ago

      @user-wr4yl7tx3w I think so, but I'm not completely sure. Since it's designed for timing code I would assume so, but don't quote me on that 😅
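
For readers outside a notebook, here is a rough sketch of the same idea with the standard-library `timeit` module, which the IPython `%timeit` magic is built on and which uses `time.perf_counter` as its default timer. The `work` function is a hypothetical placeholder for the code being measured.

```python
import timeit


def work():
    # Hypothetical workload standing in for the code under test.
    return sum(i * i for i in range(10_000))


# Each of the 5 repeats calls work() 100 times and reports the total seconds.
totals = timeit.repeat(work, number=100, repeat=5)

# The minimum is usually the least noisy estimate of the per-call cost.
best_per_call = min(totals) / 100
print(f"best of 5: {best_per_call * 1e6:.1f} us per call")
```

This suits small, repeatable snippets; for network-bound code such as image downloads, caching effects between runs still need separate care.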

  • @west
    @west 2 years ago

    Very useful overview, thank you!

  • @quick-info-101-p1p
    @quick-info-101-p1p 2 years ago

    Intro is amazing

  • @hamzarashid7579
    @hamzarashid7579 2 years ago +5

    You could use httpx instead of requests!

    • @alliedeena1141
      @alliedeena1141 2 years ago

      Is it better than the requests module?

    • @kezif
      @kezif 3 months ago

      @alliedeena1141 Depends on what you need.

  • @marcosoliveira8731
    @marcosoliveira8731 2 years ago

    I've learned a great deal here. Thank you.

  • @mohammedel2035
    @mohammedel2035 2 years ago

    Great value as always… Thanks a lot!

  • @shinrafahell
    @shinrafahell 2 years ago

    Excellent video!

  • @hopes3211
    @hopes3211 2 years ago

    Excellent content and easily explained. Awesome 👍

  • @splendorman7922
    @splendorman7922 2 years ago +1

    Any advantages over the 'unsync' library?

  • @alexanderrubioanasco5352
    @alexanderrubioanasco5352 2 years ago

    Thanks for sharing your knowledge. I have a question:
    how can I parallelize when I am training a machine learning model? I use PySpark, but is parallelizing this way better?

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 2 years ago

    Great video

  • @Y.Mahran
    @Y.Mahran 2 years ago

    Great video.

  • @Blendershick
    @Blendershick 1 year ago

    Amazing thank you!

  • @mdabdullahalhasib2920
    @mdabdullahalhasib2920 2 years ago

    Awesome

  • @PatrickSteil
    @PatrickSteil 11 months ago

    That's stupid easy.
    So does colors2 get returned with all the results from each run appended to a list?
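
As a hedged illustration of that question: with its default settings, joblib's `Parallel` call returns a plain list holding one result per input, in the same order as the inputs. The sketch below uses a hypothetical `square` function in place of the video's per-image function, and `collect_squares` is likewise an illustrative name.

```python
from joblib import Parallel, delayed


def square(x):
    # Hypothetical stand-in for the video's per-image function.
    return x * x


def collect_squares(values):
    # One result per input, gathered into a list in input order.
    return Parallel(n_jobs=2)(delayed(square)(v) for v in values)


results = collect_squares(range(5))
print(results)  # [0, 1, 4, 9, 16]
```

The ordering holds even when a later task finishes before an earlier one, because results are placed by input position rather than completion time.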

  • @mallikarjunrterdal
    @mallikarjunrterdal 10 months ago

    Is multiprocessing with shared memory possible?

  • @giovannigaiardo
    @giovannigaiardo 2 years ago

    I am enjoying this channel very much. Would anyone recommend a similar one but focused on JavaScript?

  • @joshuabardwell2294
    @joshuabardwell2294 7 months ago

    Why did your runtime change from n_jobs = 8 to -1?
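
One possible explanation, which the thread does not confirm: `n_jobs=-1` asks joblib to use all available CPUs, so it only matches `n_jobs=8` on a machine that reports exactly eight, and ordinary run-to-run noise can account for the rest. A small sketch for checking what `-1` resolves to on a given machine:

```python
from joblib import cpu_count, effective_n_jobs

# How many CPUs joblib detects on this machine.
print("CPUs detected:", cpu_count())

# How many workers n_jobs=-1 resolves to with the default backend.
workers_for_all = effective_n_jobs(-1)
print("n_jobs=-1 resolves to:", workers_for_all)
```

If the reported count is not 8, the two settings start different numbers of workers and different timings are to be expected.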

  • @shuxiaokai
    @shuxiaokai 2 years ago

    So cool 🥰🥰🥰

  • @alejandrobravo1221
    @alejandrobravo1221 2 years ago

    What font family are you using in PyCharm?

  • @RidingWithGerdas
    @RidingWithGerdas 1 year ago

    I am currently multithreading my web scraping project, opening multi browsers and clicking many things with Selenium. Would this improve anything in my case?

    • @HalfEatenMushroom
      @HalfEatenMushroom 11 months ago

      Hey, I know it's a year later, but Joblib is pretty much just a wrapper for either multiprocessing or multithreading, depending on your specified preference.
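
A rough sketch of the "specified preference" mentioned in this reply, assuming an I/O-bound task: joblib's `Parallel` accepts a `prefer` hint of `"threads"` or `"processes"`. The `io_bound` and `run_with` functions are hypothetical, and `time.sleep` stands in for waiting on a browser or network.

```python
import time

from joblib import Parallel, delayed


def io_bound(seconds):
    # Hypothetical I/O-bound task: sleeping stands in for waiting on I/O.
    time.sleep(seconds)
    return seconds


def run_with(prefer):
    """Run four short waits under the given backend hint; return (results, elapsed)."""
    start = time.perf_counter()
    results = Parallel(n_jobs=4, prefer=prefer)(
        delayed(io_bound)(0.05) for _ in range(4)
    )
    return results, time.perf_counter() - start


thread_results, thread_elapsed = run_with("threads")
print(f"threads: {thread_results} in {thread_elapsed:.2f}s")
```

Calling `run_with("processes")` exercises the process-based path instead. Code that already uses threads for waiting-heavy work, such as driving browsers, would likely see little change from switching to joblib's thread mode.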

  • @sarimbinwaseem
    @sarimbinwaseem 2 years ago

    It splits the input list and runs the function in parallel.
    I want to speed up a function that opens an xlsx file, which takes 22 seconds to open with the openpyxl library. Is it possible to speed that up?

    • @SageBetko
      @SageBetko 2 years ago

      Try the read- or write-only optimized modes in openpyxl.

    • @sarimbinwaseem
      @sarimbinwaseem 2 years ago

      @SageBetko Thanks, I'll try.

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 2 years ago

    Is it faster than JAX pmap?

  • @NimaqAlizadeh
    @NimaqAlizadeh 2 years ago

    👌👌🌹

  • @unknown-cj8gv
    @unknown-cj8gv 2 years ago

    You should show only your work and your face and nothing else; the other things diminish your efforts.

  • @joshdheda8776
    @joshdheda8776 1 year ago

    Why use a stupid example?