Want Faster HTTP Requests? Use A Session with Python!
- Published: 9 Feb 2021
- This is why I think you should use an HTTP session when web scraping with Python. It comes with many benefits and gives us access to more features within the requests library, the most common and widely used Python library for HTTP requests. In this video we look at the connection pooling it gives us, which speeds up our code when sending requests to the same server. This is perfect for scraping data or accessing APIs.
- Science
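As a rough sketch of the idea from the description (the URL and header value are placeholders, not from the video): a Session pools connections, so repeated requests to the same host reuse one open connection instead of doing a fresh handshake each time.

```python
import requests

# Without a session, every requests.get() opens a new TCP connection.
# A Session keeps a pool of connections, so repeated requests to the
# same host reuse one open connection and skip the handshake.
session = requests.Session()

# Anything set on the session is sent with every request it makes
session.headers.update({"User-Agent": "my-scraper/1.0"})

# Hypothetical usage - each call below would reuse the pooled connection:
# for page in range(1, 6):
#     r = session.get(f"https://example.com/products?page={page}")

session.close()
```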
The wonders of the internet in Australia, I am getting 46 seconds without and 38 with. haha. Good stuff, thanks John.
Another great video covering an important topic for efficient web scraping. Thank you John!
Another fantastic vid! Short, succinct and very useful!
I'll be applying this sessions thing to some of my code very shortly! As always, thanks for the great content.
Thank you!
Best vid on Requests.Session I've seen. Practical and very helpful
Oh Jeesh. First day using the Session and it made a HUGE difference in performance (14 seconds for what used to take 2 minutes!). Thanks for covering this topic so well.
Great to hear!
As always excellent work! Thank you so much for sharing your insights
Thank you so much! I was looking for a quick tutorial on keep-alive connections using the requests library and this video solved my problem.
Thanks again, John! Work-wise I am "implementing" two different APIs. They have rate limits, and there is a case where I need to slice the input args and send multiple requests. This is how I will do it.
Nice short video to elaborate the efficiency of session object. 💖
I was pretty amazed with Session performance, thanks a lot.
John, you are really cool. Your videos help me a lot!
i really appreciate your lessons, the best on youtube
Thank you so much for these awesome tips John!
i was reading about sessions for 2 days and still couldn't understand the concept, and in just 7:15 minutes i got a whole new level of knowledge... Thanks for the video buddy... really helped me...
No worries glad to help!
An awesome trick that will certainly make a difference.
The Best Web Scraping Channel on YouTube. He Teaches Each Topic so Clear and is very easy to Follow. Highly Recommend this Channel to anyone who wants to Learn Web Scraping the Right Way. Keep it up John
Thank you, very kind!
Thanks for your great effort, you are the teacher.
With my regards, Waleed
Great video. Thanks for this.
Nice to know..such a difference..
amazing video, helped out a lot
Wow.
Thank you bro!
Amazing, you should do a video on scraping a Cloudflare-protected website, but thank you
Thanks, I went from 500 requests every 87 seconds to 500 requests every 33 seconds, which is about the maximum I can do considering the rate limit
this is life saving!!!
thats very helpful, thanks
Excellent.
Another good one...cheers!
Thank you! Cheers!
Hey John, thanks first of all for the great content
Hey! Generally no, you should always use a session, it won’t speed it up significantly enough to cause blocking
great video :)
Super helpful tip
Thanks! If you liked this one, check out my latest video: much more on using the session
Very interesting
Maybe it is a stupid question, but please bear with me as I am quite new to this. Will requests.Session() keep the session alive if I provide cookies and a user agent (headers) to my requests, for example s.get(url, headers=headers, cookies=cookie)? Will it still speed up my process? Thank you in advance to the kind souls who will answer this question. And thank you for the video too!
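For anyone else wondering, a rough sketch (cookie name and values are made up): you can attach headers and cookies to the session itself instead of passing them on every call, and the keep-alive connection pooling still applies.

```python
import requests

s = requests.Session()
# Set once on the session instead of passing to every s.get() call:
s.headers.update({"User-Agent": "Mozilla/5.0"})
s.cookies.set("sessionid", "abc123")  # made-up cookie name/value

# Each request now sends these automatically and still reuses the
# pooled (keep-alive) connection:
# r = s.get("https://example.com/page")
s.close()
```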
Nice video dude! Now my default way is session.get() instead of requests.get() hahaha
Great video, i tried it and it worked like a charm. Thank you. But when I do session.close() and retry, it says connection reset by peer. How do I completely close the session?
My code improved by 75 %
Or you can do this:
from concurrent.futures import ThreadPoolExecutor
import requests

def FetchAllListing(page):
    print(page)
    r = requests.get(f"https://www.rei.com/c/backpacks?json=true&page={page}")

with ThreadPoolExecutor(max_workers=30) as e:  # threads
    e.map(FetchAllListing, range(50))  # number of pages to scrape
Nice
Good idea
Can we apply this to the option chain of the Indian Stock Exchange (NSE) so that I can transfer the data to Excel?
Here we go
Very nice, can you make a video on how to scrape LinkedIn Jobs based on profession and salary using Scrapy?
is there a way to get data from a website that has Cloudflare DDoS protection?
I want to exit or close my session after my code completes its work. How do I close the session at the end of the code execution?
Can you show us how to handle cookies required by a site?
Hi, can requests do sign-ups? I know it can do logins, but what if we send the sign-up data through POST params? Besides, the reCAPTCHA token is a problem. Can you do a video on passing that kind of payload with a reCAPTCHA token?
I love how you just slipped in 'pip install requests' as a text overlay in the beginning of the video 😂
😅 thanks
John, in your opinion, which method works faster: session, concurrent.futures, or asyncio? Speed is one of the most important factors in this work. 🤔
Hi, i could get the csrftoken from the cookies with Session for logging into the Memrise website in 2021. But I don't know why it doesn't work these days. Could you please give me any advice?
Hi, when can one expect a new live web scraping with Python?
If you use gettitle_session(s, x), isn't s at this point a session object? So is there a reason to pass it as an argument to the function vs referencing it globally?
Yes it is, we give the function our session object to use - you need to reference it outside the function though, so every request uses the same session, which is cleaner and a bit faster
Nice video! How do we close this session without using the "with" keyword?
Session.close() I believe!
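A minimal sketch of the explicit version (no real requests made here):

```python
import requests

s = requests.Session()
try:
    pass  # s.get(...) calls go here
finally:
    s.close()  # releases the pooled connections
```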
Pls do a scraping video for the Delta Airlines website
Does the session literally mean creating an application session on the server and then appending session cookies to every one of its GET requests?
Or does it mean a TCP session via a Keep-Alive connection?
I mean requests.Session()
the requests library is working slowly on my laptop (80 secs), but when i run the same code on a MacBook it works fast (0.5 secs). What could be the issue?
when using pytest, can the session be persisted in the test when the session has been created in a conftest fixture?
My initial thoughts are yes, I can see that you could do so within the same session but it’s not something I’ve done before, my experience with pytest is limited
Use pypy3, it’s what I use along with golang
Would you still recommend session over httpx?
Httpx has its own session object too
ly bro
Can a session be used for a 504 Gateway Timeout error in a Python API?
No, a 504 is a server error, not a client one, so it won't solve that issue
Unfortunately, it did not work for me. With or without session, the running time is the same... :-(
Really? Ok fair enough. To go faster you’ll need to learn about async and await
Don't you have to close the session every time??
Why is my session taking more time than normal?
I prefer time.time() for calculating time elapsed. Shorter.
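e.g. a quick sketch (the sleep stands in for the batch of requests being timed):

```python
import time

start = time.time()
time.sleep(0.05)  # stand-in for the code being timed
elapsed = time.time() - start
print(f"took {elapsed:.2f}s")
```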
HELP !!
I am using Python's requests lib and I want to get data from an API. I wrote a program and ran it on my local computer and it runs perfectly, but when I try to host the code on the web, like Heroku or Replit, it responds with status code 200 but doesn't provide the data inside the API... I tried changing the user agent to a browser agent but it didn't work. Can anyone help me out?
A 200 response could still be a captcha page or similar - try checking what information is coming back. However, it's likely being blocked as you're now on an IP shared with lots and lots of other people
@@JohnWatsonRooney It returns a JSON string with two keys, like message: ok & Type: ok, on the web host, but when I try exactly the same code with the same bearer token and headers on my local computer it provides the entire JSON file with all the data... How can I fix this?
Is it an api with an actual key or from web scraping?
@@JohnWatsonRooney It's an API with authentication token and headers
That’s odd. Im not sure, sorry!
Please make a video on regex
Good idea
HTMLSession is better than requests.Session()
hello. thank you so much for this wonderful tip.
I would like to use this session.
My .py is similar to the .py from your 'Web Scraping with Python: Ecommerce Product Pages. In Depth including troubleshooting' video.
How can I add this to that code? There are 2 requests: the first to find links, the second for product data.
You forgot to close the session that you created.
Bad practice.
Instead you should use a Python context manager...
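The context-manager version the comment is suggesting, as a sketch (not the video's code):

```python
import requests

# The with-block closes the session automatically when it exits,
# even if an exception is raised inside it:
with requests.Session() as s:
    pass  # s.get(...) calls go here
# the session is closed at this point
```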
using gevent or multithreading is much quicker than a session
Clearly haven’t heard of multithreading eh?