Want Faster HTTP Requests? Use A Session with Python!

  • Published: Feb 9, 2021
  • This is why I think you should use an HTTP session when web scraping with Python. It comes with many benefits and gives us access to more features within the requests library, the most commonly used Python library for HTTP requests. In this video we look at the connection pooling a session gives us, which speeds up our code when sending requests to the same server. This is perfect for scraping data or accessing APIs.
  • Science
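
The comparison described above can be sketched roughly as follows. This is a minimal sketch, not the code from the video: the names `time_fetches` and `demo` are hypothetical, and `demo` assumes the third-party `requests` package is installed. Timings depend heavily on the network and the server.

```python
import time


def time_fetches(get, urls):
    """Fetch each URL in order with `get`; return (elapsed_seconds, status_codes)."""
    start = time.perf_counter()
    statuses = [get(url).status_code for url in urls]
    return time.perf_counter() - start, statuses


def demo(urls):
    """Compare plain requests.get with a shared Session (makes real network calls)."""
    import requests  # third-party; imported here so time_fetches has no dependency

    without_session, _ = time_fetches(requests.get, urls)
    # A Session keeps the underlying connection open between requests to the
    # same host, so later requests can skip the TCP/TLS handshake.
    with requests.Session() as session:
        with_session, _ = time_fetches(session.get, urls)
    print(f"without session: {without_session:.2f}s, with session: {with_session:.2f}s")
```

Calling `demo` with a list of URLs that all point at one host should show the difference most clearly, since pooled connections are reused per host.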

Comments • 94

  • @ugurdev
    @ugurdev 3 years ago +7

    The wonders of internet in Australia, I am getting 46 seconds without and 38 with. haha. Good stuff, thanks John.

  • @celerystalk390
    @celerystalk390 3 years ago +4

    Another great video covering an important topic for efficient web scraping. Thank you John!

  • @pr0skis
    @pr0skis 3 years ago +10

    Another fantastic vid! Short, succinct and very useful!
    I'll be applying this sessions thing to some of my code very shortly! As always, thanks for the great content.

  • @cetilly
    @cetilly 3 years ago +1

    Best vid on Requests.Session I've seen. Practical and very helpful

  • @cetilly
    @cetilly 3 years ago +5

    Oh Jeesh. First day using the Session and it made a HUGE difference in performance (14 seconds for what used to take 2 minutes!). Thanks for covering this topic so well.

  • @GordonShamway1984
    @GordonShamway1984 3 years ago +5

    As always excellent work! Thank you so much for sharing your insights

  • @nikhil182
    @nikhil182 1 year ago +1

    Thank you so much! I was looking for a quick tutorial on keep-alive type connections using the requests library and this video solves my problem.

  • @multigladiator384
    @multigladiator384 2 years ago +2

    Thanks again, John! Workwise I am "implementing" two different APIs. They have rate limits and there is the case where I need to slice the input args and send multiple requests. This is how I will do it.

  • @tubelessHuma
    @tubelessHuma 3 years ago +1

    Nice short video to elaborate the efficiency of session object. 💖

  • @nurshah816
    @nurshah816 2 years ago +1

    I was pretty amazed with Session performance, thanks a lot.

  • @katherinebaker3220
    @katherinebaker3220 2 years ago +1

    John, you are really cool. Your videos help me a lot!

  • @avkngeeks
    @avkngeeks 2 years ago +1

    i really appreciate your lessons, the best on youtube

  • @0xfsec
    @0xfsec 3 years ago +1

    Thank you so much for these awesome tips, John!

  • @abhijeetbonde8635
    @abhijeetbonde8635 3 years ago +2

    i was reading about sessions for 2 days and still couldn't understand the concept and just 7:15 minutes and i got a whole new level of knowledge... Thanks for the video buddy... really helped me...

  • @KhalilYasser
    @KhalilYasser 3 years ago +1

    An awesome trick that will certainly make a difference.

  • @rakshithrajesh3938
    @rakshithrajesh3938 2 years ago +2

    The Best Web Scraping Channel on YouTube. He Teaches Each Topic so Clearly and is very easy to Follow. Highly Recommend this Channel to anyone who wants to Learn Web Scraping the Right Way. Keep it up John

  • @easyshopping7264
    @easyshopping7264 3 years ago +1

    Thanks for your great effort, you are the teacher,
    with my regards , waleed

  • @i701Dev
    @i701Dev 2 years ago

    Great video. Thanks for this.

  • @Klausi-uq4xq
    @Klausi-uq4xq 3 years ago +1

    Nice to know..such a difference..

  • @darrenhuang2600
    @darrenhuang2600 3 years ago +1

    amazing video, helped out a lot

  • @SkySesshomaru
    @SkySesshomaru 2 years ago +1

    Wow.
    Thank you bro!

  • @JesusTorres-bt2eb
    @JesusTorres-bt2eb 3 years ago +2

    Amazing, you should do a video on scraping a Cloudflare website, but thank you

  • @campbat5712
    @campbat5712 1 year ago +3

    Thanks, I went from 500 requests every 87 seconds to 500 requests every 33 seconds, which is about the maximum I can do considering the rate limit

  • @potatopc
    @potatopc 27 days ago

    this is life saving!!!

  • @Niki14741
    @Niki14741 2 years ago +1

    that's very helpful, thanks

  • @mehdismaeili3743
    @mehdismaeili3743 1 year ago +1

    Excellent.

  • @thatguy6664
    @thatguy6664 3 years ago +1

    Another good one...cheers!

  • @tanzimat2039
    @tanzimat2039 2 years ago +1

    Hey John, thanks for the beautiful content firstly

    • @JohnWatsonRooney
      @JohnWatsonRooney  2 years ago +1

      Hey! Generally no, you should always use a session, it won’t speed it up significantly enough to cause blocking

  • @flimsy1417
    @flimsy1417 2 years ago +1

    great video :)

  • @alien9940
    @alien9940 1 year ago +1

    Super helpful tip

    • @JohnWatsonRooney
      @JohnWatsonRooney  1 year ago

      Thanks! If you liked this one, check out my latest video, which has much more on using the session

  • @aogunnaike
    @aogunnaike 3 years ago +1

    Very interesting

  • @Achiesamablog
    @Achiesamablog 3 years ago +1

    Maybe it is a stupid question, but please bear with me as I am quite new to this. Will requests.Session() keep the session alive if I provide cookies and a user agent (headers) to my requests, for example s.get(url, headers=headers, cookies=cookie)? Will it still speed up my process? Thank you in advance to the kind souls who answer this question. And thank you for the video too!
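
On the question above: per the requests documentation, arguments passed to a single call such as `s.get(url, headers=..., cookies=...)` are merged with the session's own settings for that request and are not persisted afterwards, and passing them does not stop the session from reusing its pooled connection. Headers and cookies can also be set once on the session itself. A minimal sketch, where `configure_session` and `demo` are hypothetical names and the header and cookie values are placeholders:

```python
def configure_session(session, headers=None, cookies=None):
    """Set defaults once on a requests.Session-like object and return it."""
    if headers:
        session.headers.update(headers)  # sent with every request from this session
    if cookies:
        session.cookies.update(cookies)  # stored in the session's cookie jar
    return session


def demo(url):
    """Real usage (makes a network call); assumes the requests package is installed."""
    import requests  # third-party

    with requests.Session() as s:
        configure_session(
            s,
            headers={"User-Agent": "my-scraper/0.1"},  # placeholder value
            cookies={"example_cookie": "value"},       # placeholder value
        )
        return s.get(url).status_code
```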

  • @Ervilhav10
    @Ervilhav10 2 years ago +1

    Nice video dude! Now my default way is session.get() instead of requests.get() hahaha

  • @primestarchannel7221
    @primestarchannel7221 2 years ago

    Great video, I tried it and it worked like a charm. Thank you. But when I call session.close() and retry, it says connection reset by peer. How do I completely close the session?

  • @tushargupta5890
    @tushargupta5890 8 months ago +1

    My code improved by 75 %

  • @aiisnice1453
    @aiisnice1453 3 years ago +4

    Or you can do this:
    from concurrent.futures import ThreadPoolExecutor
    import requests
    def FetchAllListing(page):
        print(page)
        # scheme added (assumed https); the URL in the original comment had none
        r = requests.get(f"https://rei.com/c/backpacks?json=true&page={page}")
        return r
    with ThreadPoolExecutor(max_workers=30) as e:  # number of threads
        e.map(FetchAllListing, range(50))  # number of pages to scrape
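
Threads and sessions are not mutually exclusive. A minimal sketch of combining them, assuming one session per worker thread as a cautious choice (sharing a single `Session` across threads is not something I can confirm is documented as safe). The names `get_session`, `fetch_status`, and `fetch_all` are hypothetical, and `session_factory` would typically be `requests.Session`:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()  # holds one session per worker thread


def get_session(session_factory):
    """Return this thread's session, creating it on first use."""
    if not hasattr(_local, "session"):
        _local.session = session_factory()
    return _local.session


def fetch_status(url, session_factory):
    """GET one URL through the calling thread's own session."""
    return get_session(session_factory).get(url).status_code


def fetch_all(urls, session_factory, max_workers=8):
    """Fetch URLs concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda u: fetch_status(u, session_factory), urls))
```

With requests installed this would be called as `fetch_all(urls, requests.Session)`. Each thread then reuses its own pooled connection. The sketch does not close the per-thread sessions, which a longer-running program would want to handle.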

  • @lakshminarasimhanadimoorth3945

    Can we apply this to the option chain of the Indian stock exchange (NSE) so that I can transfer the data to Excel?

  • @gouemoregis195
    @gouemoregis195 3 years ago +1

    Here we go

  • @islamibrahim4735
    @islamibrahim4735 3 years ago

    Very nice, can you make a video on how to scrape LinkedIn jobs based on profession and salary using Scrapy?

  • @erdemgunal13
    @erdemgunal13 3 years ago

    Is there a way to get data from a website that has Cloudflare DDoS protection?

  • @welcometout
    @welcometout 1 year ago

    I want to exit or close my session after my code completes its work. How do I close the session at the end of the code execution?
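
One common way to do this is a `with` block, since `requests.Session` works as a context manager and is closed when the block exits. A minimal sketch, where `fetch_statuses` is a hypothetical name and `session_factory` would typically be `requests.Session`:

```python
def fetch_statuses(urls, session_factory):
    """Open a session, fetch every URL, and close the session on exit."""
    with session_factory() as session:  # leaving the block closes the session
        return [session.get(url).status_code for url in urls]
```

With requests installed this would be called as `fetch_statuses(urls, requests.Session)`.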

  • @axelamoe
    @axelamoe 1 year ago

    Can you show us how to handle required cookies by a site?

  • @HuskyTales2023
    @HuskyTales2023 3 years ago

    Hi, can requests do sign-ups? I know it can do logins, but what if we send the sign-up data through POST params? Besides, the reCAPTCHA token is a problem. Can you do a video on passing that kind of payload with a reCAPTCHA token?

  • @glenn8781
    @glenn8781 2 years ago

    I love how you just slipped in 'pip install requests' as a text overlay in the beginning of the video 😂

  • @Automatic-show
    @Automatic-show 1 year ago

    John, in your opinion, which method works faster, session or concurrent.futures or asyncio ???? speed is one of the most important factors in this work.🤔🤔🤔🤔🤔🤔🤔

  • @quangjoseph8287
    @quangjoseph8287 2 years ago

    Hi, I could get the csrftoken from the cookies with Session for logging in to the Memrise website in 2021. But I don't know why it doesn't work these days. Could you please give me any advice?

  • @maciekpaciarski9343
    @maciekpaciarski9343 3 years ago

    Hi, when can one expect a new live web scraping with Python?

  • @DM-py7pj
    @DM-py7pj 3 years ago +1

    If you use gettitle_session(s, x), isn't s at this point a session object? So is there a reason for not passing it as an argument to the function vs. referencing it globally?

    • @JohnWatsonRooney
      @JohnWatsonRooney  3 years ago

      Yes it is, we give the function our session object to use. You need to create it outside the function though, so every request uses the same session, which makes it cleaner and a bit faster

  • @pahadivillager5354
    @pahadivillager5354 3 years ago +1

    Nice video! How do we close this session without using the "with" keyword?
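
Without `with`, the session can be closed explicitly by calling its `close()` method; putting that call in a `finally` clause makes sure it runs even if a request raises. A minimal sketch, where `fetch_then_close` is a hypothetical name and `session` would typically be a `requests.Session()`:

```python
def fetch_then_close(session, urls):
    """Fetch every URL, then close the session even if a request fails."""
    try:
        return [session.get(url).status_code for url in urls]
    finally:
        session.close()  # releases the pooled connections
```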

  • @manikandanmanickam9433
    @manikandanmanickam9433 3 years ago

    Pls do scraping for the Delta airlines website

  • @novianindy887
    @novianindy887 1 year ago +1

    Does the session literally mean creating an application session on the server and then appending session cookies to every one of its GET requests?
    Or does it mean a TCP session via a Keep-Alive connection?
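
As the requests documentation describes it, a `Session` works on the client side in both ways: it persists cookies across requests and it reuses the underlying TCP connection through urllib3's connection pooling. The keep-alive part can be observed without requests at all. A minimal sketch using only the standard library, in which a throwaway local server counts the TCP connections it accepts; `CountingHandler` and `count_connections` are hypothetical names:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections alive by default
    connections = 0
    _lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request
        super().setup()
        with CountingHandler._lock:
            CountingHandler.connections += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def count_connections(n_requests, reuse):
    """Send n_requests GETs to a local server; return how many TCP connections it saw."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        conn = http.client.HTTPConnection(host, port, timeout=5)
        for _ in range(n_requests):
            if not reuse:
                # Simulate "no session": a fresh connection for every request
                conn.close()
                conn = http.client.HTTPConnection(host, port, timeout=5)
            conn.request("GET", "/")
            conn.getresponse().read()
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections
```

Here `count_connections(5, reuse=True)` goes through a single connection, while `reuse=False` opens one per request. The saved handshakes are where the speed-up in the video comes from.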

  • @rtxmax8223
    @rtxmax8223 2 years ago

    The requests library is slow on my laptop (80 secs); when I run the same code on a MacBook it is fast (0.5 secs). What could be the issue?

  • @dalefixter
    @dalefixter 2 years ago +1

    when using pytest, can the session be persisted in the test, when the session had been created in a conftest fixture?

    • @JohnWatsonRooney
      @JohnWatsonRooney  2 years ago

      My initial thoughts are yes, I can see that you could do so within the same session but it’s not something I’ve done before, my experience with pytest is limited

  • @kenkelvin4023
    @kenkelvin4023 3 years ago

    Use pypy3, it’s what I use along with golang

  • @hlrn4141
    @hlrn4141 5 months ago +1

    Would you still recommend session over httpx?

  • @Christian-mn8dh
    @Christian-mn8dh 1 year ago

    ly bro

  • @indhumathi7391
    @indhumathi7391 2 years ago

    Can a session be used for a 504 Gateway Timeout error in a Python API?

    • @JohnWatsonRooney
      @JohnWatsonRooney  2 years ago

      No, a 504 is a server error, not a client one, so it won’t solve that issue
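
A session will not prevent a 504, but since gateway timeouts are often transient, a client can retry. A minimal sketch under stated assumptions: `get_with_retries` and `RETRY_STATUSES` are hypothetical names, and the choice of status codes and backoff values is illustrative. requests can also be configured for retries by mounting an `HTTPAdapter` with a urllib3 `Retry` object, which is not shown here.

```python
import time

RETRY_STATUSES = {502, 503, 504}  # assumed set of "try again later" responses


def get_with_retries(session, url, attempts=3, backoff=1.0, sleep=time.sleep):
    """GET `url`, retrying on gateway-style errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        response = session.get(url)
        if response.status_code not in RETRY_STATUSES:
            return response
        if attempt < attempts - 1:
            sleep(backoff * 2 ** attempt)  # wait 1s, 2s, 4s, ... between tries
    return response  # still failing after the last attempt
```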

  • @nooral-firdaus7144
    @nooral-firdaus7144 2 years ago +1

    Unfortunately, it did not work for me. With or without session, the running time is the same... :-(

    • @JohnWatsonRooney
      @JohnWatsonRooney  2 years ago

      Really? Ok fair enough. To go faster you’ll need to learn about async and await

  • @PolSenserrich
    @PolSenserrich 1 year ago

    Don't you have to close the session every time??

  • @deepak2950
    @deepak2950 3 years ago

    Why is my session taking more time than normal?

  • @mnsvcar9900
    @mnsvcar9900 10 months ago

    I prefer time.time() for calculating time elapsed. Shorter.

  • @mr.strange7002
    @mr.strange7002 2 years ago +1

    HELP !!
    I am using Python's requests lib and I want to get data from an API .. so I wrote a program and ran it on my local computer, where it runs perfectly .. but when I host the code on the web, like on Heroku or Replit, it responds with status code 200 but doesn't provide the data from the API... I tried changing the user agent to a browser agent but it didn't work.. can anyone help me out..

    • @JohnWatsonRooney
      @JohnWatsonRooney  2 years ago

      A 200 response could still be a captcha page or similar, so try checking what information is coming back. However, it’s likely being blocked as you’re now on an IP shared with lots and lots of other people

    • @mr.strange7002
      @mr.strange7002 2 years ago

      @@JohnWatsonRooney It returns a JSON string with two keys, like message : ok & Type : ok, on the web host, but when I try exactly the same code with the same bearer token and headers on my local computer it provides the entire JSON file with all the data ... How can I fix this ...

    • @JohnWatsonRooney
      @JohnWatsonRooney  2 years ago

      Is it an api with an actual key or from web scraping?

    • @mr.strange7002
      @mr.strange7002 2 years ago +1

      @@JohnWatsonRooney It's an API with authentication token and headers

    • @JohnWatsonRooney
      @JohnWatsonRooney  2 years ago +1

      That’s odd. I’m not sure, sorry!

  • @anikahmed7456
    @anikahmed7456 3 years ago +2

    Please make video on regex

  • @Shankar_Vasudevan
    @Shankar_Vasudevan 1 year ago

    HTMLSession is better than requests.Session()

  • @bunyaminsahiner9060
    @bunyaminsahiner9060 3 years ago

    hello. thank you so much for this wonderful tip.
    I would like to use this session.
    My .py is similar to the one in your 'Web Scraping with Python: Ecommerce Product Pages. In Depth including troubleshooting' video.
    How can I add it to that code? There are 2 requests: the first to find links, the second for product data.

  • @ErikS-
    @ErikS- 1 year ago

    You forget to close the session that you created.
    Bad practice.
    Instead you should use a python context...

  • @hh3739
    @hh3739 3 years ago

    Using gevent or multithreading is much quicker than a session

  • @kenkelvin4023
    @kenkelvin4023 3 years ago

    Clearly haven’t heard of multithreading eh?

  • @tubelessHuma
    @tubelessHuma 3 years ago

    Nice short video to elaborate the efficiency of session object. 💖