How to Login with Python Scrapy (2022)

  • Published: 22 Dec 2024

Comments • 10

  • @scrapeops 2 years ago +1

    Hey guys! If you have any ideas about websites that you would like us to show you how to scrape, please let us know! Oh, and tell us which programming language/framework too - we will be branching out into videos on scraping with Node.js and other languages :)

  • @huwjarse5761 2 years ago +2

    Thanks - well explained, appreciate you sharing.

  • @BrokenWorlds 2 years ago +1

    How can I modify it so that, after logging in, it finds all possible URLs and follows them until all are found or a depth limit is reached, and then returns all extracted URLs in an array?
    And how can I start your code with something like c = basic_login_spider.CrawlerProcess({})?

  • @nodariasatiani4997 2 years ago +1

    Thanks a lot!

  • @nodariasatiani4997 2 years ago

    How can we store cookies and pass them to the next function, which parses the product details page?
    I tried to include cookies_dict = {cookie['name']: cookie['value'] for cookie in response.data['cookies']}
    and cookies=cookies_dict in scrapy.Request, but received the following error:
    cookies_dict = {cookie['name']: cookie['value'] for cookie in response.data['cookies']}
    AttributeError: 'HtmlResponse' object has no attribute 'data'

    • @scrapeops 2 years ago +1

      The easiest thing is to create a new variable, "session_cookies = {}", under the spider name: github.com/python-scrapy-playbook/scrapy-login-spiders/blob/main/scrapy_login_spider/spiders/headless_browser_login_spider.py#L41. Then you can save the cookies into that variable with "self.session_cookies = {cookie['name']: cookie['value'] for cookie in response.data['cookies']}" on this line: github.com/python-scrapy-playbook/scrapy-login-spiders/blob/main/scrapy_login_spider/spiders/headless_browser_login_spider.py#L62. After that, any time you need the cookies you can pass them into scrapy.Request with: "yield scrapy.Request(url=url, cookies=self.session_cookies, callback=self.parse)"
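
The cookie conversion described in the reply above can be sketched on its own. This is a minimal illustration, not code from the linked repository: it assumes the login response exposes a data mapping with a 'cookies' list of name/value dicts, as the reply's snippet implies. The helper name extract_session_cookies and the fake response object are hypothetical. A plain HtmlResponse has no data attribute, which is what the AttributeError in the question reports, so the helper falls back to an empty dict in that case.

```python
from types import SimpleNamespace


def extract_session_cookies(response):
    """Return {name: value} built from response.data['cookies'].

    Returns an empty dict when the response has no 'data' attribute,
    as is the case for a plain HtmlResponse.
    """
    data = getattr(response, "data", None)
    if not data:
        return {}
    return {cookie["name"]: cookie["value"] for cookie in data.get("cookies", [])}


# Stand-in for the login response (an assumption, for illustration only).
fake_response = SimpleNamespace(
    data={"cookies": [{"name": "sessionid", "value": "abc123"}]}
)

session_cookies = extract_session_cookies(fake_response)
print(session_cookies)
```

Inside a spider, the result would be stored on self.session_cookies in the login callback and then passed as cookies=self.session_cookies to each later scrapy.Request, as the reply describes.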

    • @nodariasatiani4997 2 years ago +1

      @scrapeops Tested, works like a charm, thanks! ❤❤❤

  • @awaisahmad5908 1 year ago +1

    Change the password you displayed in the video 😁

  • @abhishekk3561 1 year ago

    Bro, copying a file and then deleting its entire content... you could just create a new file instead.