Python Selenium Tutorial #10 - Scrape Websites with Infinite Scrolling

  • Published: 29 Oct 2024

Comments • 26

  • @narkornchaiwong9114
    @narkornchaiwong9114 1 year ago +1

    Does Selenium work with a website login & password plus Google Authenticator? Can Python fill in the inputs on the login page ... the result can't load from Selenium

  • @Yaser-ih2cx
    @Yaser-ih2cx 5 months ago

    I can't find the code for this video in your GitHub link.

  • @Faybmi
    @Faybmi 6 months ago +1

    Is it possible to start parsing right away with the fiftieth element, instead of parsing everything from the beginning again?

    • @MichaelKitas
      @MichaelKitas  5 months ago

      It wouldn't matter; we replace the old scraped values with old + new ones each time. Is there a reason you want to start specifically where you left off? (Performance-wise it doesn't matter)
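
The replace-each-round approach Michael describes can be sketched as follows. This is a minimal illustration, not code from the video: `scroll_and_collect`, the CSS selector argument, and the round/pause limits are hypothetical names chosen here.

```python
import time

def scroll_and_collect(driver, css, max_rounds=20, pause=2):
    """Scroll to the bottom repeatedly, re-reading the whole item list each round.

    Each round *replaces* the collected list with the freshly scraped texts
    (old + new), because the page still contains the earlier elements;
    appending instead would duplicate everything already scraped.
    """
    items = []
    for _ in range(max_rounds):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # crude wait for new content; an explicit wait is more robust
        # "css selector" is the string value behind Selenium's By.CSS_SELECTOR
        elements = driver.find_elements("css selector", css)
        new_items = [el.text for el in elements]
        if len(new_items) == len(items):
            break  # nothing new loaded this round, assume we reached the end
        items = new_items  # replace, don't append
    return items
```

The `len(new_items) == len(items)` check is also the stop condition suggested later in this thread for pages that would otherwise loop forever.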

  • @pineappily3119
    @pineappily3119 10 months ago +1

    Hi, I have a question! Your code works very well, but when I scrape, after some time the data gets scraped from the start again. Is there any way around it?

    • @MichaelKitas
      @MichaelKitas  10 months ago

      Yeah, you should add an if statement to check whether the amount you scraped is the same as the amount you currently have saved; if so, stop the script

    • @pineappily3119
      @pineappily3119 10 months ago

      @@MichaelKitas Actually it didn't scrape everything. It just scrapes everything from the start again. But I got it solved. Thanks

  • @emphieishere
    @emphieishere 8 months ago +1

    Thanks for a great video! Could you please explain one thing I just don't get: why should we replace the items list every time instead of appending to it? I've tried to see how Instagram behaves, and it seems like every time it scrolls down it loads an exact set of items and deletes the previous ones from the code. Or am I mistaken?

    • @MichaelKitas
      @MichaelKitas  8 months ago

      Because we would get duplicates each time we append: when new items are loaded, we also get the old items again.

  • @yafethtb
    @yafethtb 2 years ago +1

    How about appending element.text directly to the items list instead of replacing the items list with textElements? Or does Selenium, each time it scrolls the page, scrape all of the previous element.text values all over again? If that's the case, what if we use a set instead of a list to hold the results, so we only keep the unique ones?

    • @MichaelKitas
      @MichaelKitas  2 years ago

      It scrapes everything all over again, correct. You can try a set; I am not sure what the difference would be 👍

    • @yafethtb
      @yafethtb 2 years ago +1

      @@MichaelKitas Ah, I see. I assumed it would just scrape the current page after scrolling, but it seems it doesn't work like that. Thanks for the info.

    • @yafethtb
      @yafethtb 2 years ago

      Then might it be better to scroll the page to the very end and only then scrape all the content? That way we don't have to keep updating the list.

    • @MichaelKitas
      @MichaelKitas  2 года назад

      @@yafethtbThat’s a bad practice, as some pages like Facebook Marketplace never have an ending and by the time they do you ram will overload and you will never get any data
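
The set idea raised in this thread can be sketched as an order-preserving de-duplication: fold each rescraped batch into one list, skipping texts already seen. `collect_unique` is a hypothetical helper, not from the video; note that a plain `set` alone would lose page order and merge genuinely identical texts (two items both titled "Post 1"), which is one reason replacing the list is the simpler default.

```python
def collect_unique(batches):
    """Fold successive scrape batches into one ordered, de-duplicated list.

    Uses a set for O(1) membership checks while keeping a list to
    preserve the order in which items first appeared on the page.
    """
    seen = set()
    unique = []
    for batch in batches:
        for text in batch:
            if text not in seen:
                seen.add(text)
                unique.append(text)
    return unique
```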

  • @ronny584
    @ronny584 2 years ago +1

    For some reason my website can't load from a Selenium scroll; it just gets stuck there.

    • @MichaelKitas
      @MichaelKitas  1 year ago

      What do you mean? It doesn't scroll?

  • @huey-nibiru
    @huey-nibiru 1 year ago +1

    Great video, thanks for the help

  • @adamsteklov
    @adamsteklov 1 year ago +1

    Nah, nothing works. The browser just closes before scrolling to page 2

    • @MichaelKitas
      @MichaelKitas  1 year ago

      It's not that the method doesn't work; either you have an error and the browser is crashing, or you are closing the browser too soon

    • @adamsteklov
      @adamsteklov 1 year ago +1

      @@MichaelKitas Solved it with albums?page=* . The infinite scrolling has pages
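
The commenter's fix relies on the infinite scroll being backed by a paginated endpoint. A minimal sketch, assuming a hypothetical site whose pages are addressed with a `?page=N` query parameter (the parameter name and the `albums` path are site-specific; inspect the browser's network tab to find the real one):

```python
def paged_urls(base, pages):
    """Build the URLs behind an infinite scroll that is really paginated.

    The ?page=N parameter is an assumption for illustration; many sites
    use other names (offset, cursor, p), so check the network requests first.
    """
    return [f"{base}?page={n}" for n in range(1, pages + 1)]
```

Each URL can then be loaded with `driver.get(url)` and scraped directly, with no scrolling at all.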

  • @RealEstate3D
    @RealEstate3D 2 years ago +1

    In my use case the first items disappear as new items are loaded, which makes sense for an application that doesn't want to exhaust the RAM. In these cases, unfortunately, this wouldn't be a solution.

    • @MichaelKitas
      @MichaelKitas  2 years ago

      Why not? Just save the items, and every time you scrape new items, append them to an array or a JSON file

    • @gomebenmoshe832
      @gomebenmoshe832 1 year ago

      Did you ever solve this? I have the same problem
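
For pages that unload old items as new ones appear, the save-and-append idea above can be sketched like this; `append_items` and the single-JSON-file layout are hypothetical choices for illustration:

```python
import json
import os

def append_items(path, new_items):
    """Merge a freshly scraped batch into a JSON file on disk.

    Items already dropped from the DOM survive between scrolls because
    every batch is appended (de-duplicated) to the saved list instead of
    replacing it.
    """
    saved = []
    if os.path.exists(path):
        with open(path) as f:
            saved = json.load(f)
    seen = set(saved)
    saved.extend(text for text in new_items if text not in seen)
    with open(path, "w") as f:
        json.dump(saved, f)
    return saved
```

Calling this after every scroll keeps the file growing even when the page itself only ever holds a window of recent items.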

  • @anurajms
    @anurajms 2 years ago +1

    thank you