Hey guys! If you have any ideas about websites that you would like us to show you how to scrape, please let us know! Oh, and what programming language/framework too - we will be branching out into videos for scraping with Node.js and other languages too :)
Hello, I need help, I'm stuck.
Thanks - well explained, appreciate you sharing.
How can I modify it so that, after logging in, it finds all possible URLs and follows them until they are all found or a depth limit is reached, and returns all the extracted URLs in an array?
And how can I start your code, like c = basic_login_spider.CrawlerProcess({})?
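A minimal sketch of how a spider is usually launched with CrawlerProcess - assuming the spider class from the repo is called BasicLoginSpider (the class name and import path here are guesses):

    from scrapy.crawler import CrawlerProcess
    from scrapy_login_spider.spiders.basic_login_spider import BasicLoginSpider  # hypothetical import path

    # DEPTH_LIMIT is Scrapy's built-in cap on how many links deep the crawl goes
    process = CrawlerProcess(settings={"DEPTH_LIMIT": 3})
    process.crawl(BasicLoginSpider)  # pass the spider class itself, not an instance
    process.start()                  # blocks until the crawl finishes

To collect every URL, the spider's parse callback could append each response.url to a list on the spider and follow every link it finds, e.g. yield response.follow(href, callback=self.parse) for each href - Scrapy filters duplicate requests automatically, and DEPTH_LIMIT stops the crawl at the chosen depth.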
Thanks a lot!
How can we store and pass cookies to the next function to parse the product details page?
I tried to include cookies_dict = {cookie['name']: cookie['value'] for cookie in response.data['cookies']}
and cookies=cookies_dict in scrapy.Request, but received the following error:
cookies_dict = {cookie['name']: cookie['value'] for cookie in response.data['cookies']}
AttributeError: 'HtmlResponse' object has no attribute 'data'
The easiest thing is to create a new variable, session_cookies = {}, under the spider name (github.com/python-scrapy-playbook/scrapy-login-spiders/blob/main/scrapy_login_spider/spiders/headless_browser_login_spider.py#L41). You can then save the cookies into that variable with self.session_cookies = {cookie['name']: cookie['value'] for cookie in response.data['cookies']} on this line: github.com/python-scrapy-playbook/scrapy-login-spiders/blob/main/scrapy_login_spider/spiders/headless_browser_login_spider.py#L62. Then, any time you need the cookies, you can pass them into scrapy.Request with: yield scrapy.Request(url=url, cookies=self.session_cookies, callback=self.parse)
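Putting that together, here is a minimal sketch of the whole pattern inside the spider. It assumes the login response exposes cookies via response.data['cookies'] as in the linked headless-browser spider (a plain HtmlResponse has no .data attribute, which is what caused the error above); the product URL, CSS selector, and parse_product callback are made-up placeholders:

    import scrapy

    class HeadlessBrowserLoginSpider(scrapy.Spider):
        name = "headless_browser_login"
        session_cookies = {}  # holds the cookies captured after login

        def parse_login_response(self, response):
            # response.data['cookies'] is assumed available here, as in the linked spider
            self.session_cookies = {
                cookie['name']: cookie['value'] for cookie in response.data['cookies']
            }
            # reuse the stored cookies on the next request
            yield scrapy.Request(
                url="https://example.com/products",  # hypothetical URL
                cookies=self.session_cookies,
                callback=self.parse,
            )

        def parse(self, response):
            # pass the same cookies along to the product details callback
            for href in response.css("a.product::attr(href)").getall():  # hypothetical selector
                yield response.follow(
                    href, cookies=self.session_cookies, callback=self.parse_product
                )

        def parse_product(self, response):
            yield {"url": response.url, "title": response.css("title::text").get()}

Because the cookies live on self, every callback in the spider can reach them without threading them through request meta.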
@scrapeops tested, works like a charm, thanks! ❤❤❤
Change the password you displayed in the video 😁
Bro, copying a file and then deleting its entire content... you could just create a new file instead, no?