Build undetectable Amazon scraper with n8n, Puppeteer and Scraping Browser

  • Published: 17 Nov 2024

Comments • 31

  • @workfloows
    @workfloows  1 year ago +1

    Hello, thanks for watching my video! If you want to play a bit more with scraping using n8n and Puppeteer, here is my previous tutorial: ruclips.net/video/YonNJqAAxdg/видео.html

  • @JayeeeMaooo
    @JayeeeMaooo 8 months ago

    Thanks for teaching the lesson starting from the package install - it's so important!!! Keep doing it.

    • @workfloows
      @workfloows  8 months ago

      Thank you very much - I’m glad you find my video helpful!

  • @SUDHANSHU934
    @SUDHANSHU934 1 year ago +1

    I didn't like this video... I loved it ❤
    There are so many new things I've learnt from this video today.
    Btw, awesome video editing... 😍

    • @workfloows
      @workfloows  1 year ago

      Thank you very much! I'm happy that you like the video and I'm very grateful for your support. All the best to you!

  • @pjones6749
    @pjones6749 10 months ago +1

    Great video. Question: I am trying to reduce cloud costs for all these tools. How do you host Baserow, and do you host it on the same server as n8n? I currently use Hetzner, but I'm not sure if it can handle both.

    • @workfloows
      @workfloows  10 months ago

      Hey, thank you very much for your comment and kind words about my work - I really appreciate it!
      Yes, it is possible. Although I haven't self-hosted Baserow yet (I use the cloud option for now), I'm pretty sure it should not be a problem to run both n8n and other apps on the same VPS. The key is to give each app its own port. I suppose a basic machine with ~2 GB of memory should handle both (a rough sketch of such a setup follows below this thread).

    • @pjones6749
      @pjones6749 10 months ago

      @workfloows Thank you... Subscribed.
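
A minimal sketch of the setup suggested in the reply above - n8n and Baserow side by side on one VPS, each on its own port. This is an editor's illustration, not from the video; the images are the projects' official Docker images, and the public URL is a placeholder:

```yaml
services:
  n8n:
    image: n8nio/n8n
    restart: unless-stopped
    ports:
      - "5678:5678"          # n8n's default port

  baserow:
    image: baserow/baserow   # all-in-one image; listens on port 80 internally
    restart: unless-stopped
    ports:
      - "8080:80"            # expose Baserow on a different host port than n8n
    environment:
      BASEROW_PUBLIC_URL: "http://YOUR-VPS-IP:8080"  # placeholder
```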

  • @monitorleilao
    @monitorleilao 11 months ago +1

    Congrats bro! 😃😃😃😃😃😃

  • @Falalfel
    @Falalfel 6 months ago

    Why do you upload the Node.js code to Google Cloud instead of running it directly in a Code node in the n8n workflow?
    Is it because you can't import modules? I get a "module not found" error.
    Is there any other way?
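
For context (an editor's note; the thread never answers this): self-hosted n8n blocks external imports in the Code node by default, which produces exactly this "module not found" error. A hedged sketch, assuming the instance is started with the documented NODE_FUNCTION_ALLOW_EXTERNAL environment variable; the module and URL are illustrative:

```js
// Inside an n8n Code node (self-hosted, "Run Once for All Items").
// This require only works if the instance was started with
// NODE_FUNCTION_ALLOW_EXTERNAL=axios (or =* for all modules).
const axios = require('axios');

const { data } = await axios.get('https://example.com/api/items');

// Code nodes return an array of items, each wrapped in { json: ... }
return data.map((item) => ({ json: item }));
```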

  • @RolandoLopezNieto
    @RolandoLopezNieto 11 months ago

    Great video

    • @workfloows
      @workfloows  11 months ago

      Thank you very much!

  • @GehirnGoldmine
    @GehirnGoldmine 1 year ago

    What is the purpose of this scraped data? Obviously, there seem to be some use-cases. Would you please share some?

    • @workfloows
      @workfloows  1 year ago

      Hello, thanks for your comment.
      Absolutely - mostly analytical ones, for example: monitoring price changes, checking product fit and competitiveness, exploring customer preferences (based on the number of products sold), SEO analysis (e.g. what keywords competitors use) and many more.
      Basically, the key here is not only the type of data scraped, but also the automation around it. In this example I scraped search results for only one keyword, but imagine you'd like to run the analysis for dozens or hundreds of product types - that's also possible with this workflow (a rough sketch follows below this thread).
      Scraped and structured data is also much easier to read and transform, which simplifies analytical tasks.

    • @GehirnGoldmine
      @GehirnGoldmine 1 year ago

      @workfloows Hey, thank you very much for your great answer! Makes a lot of sense to me.
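
A hypothetical sketch of the "dozens of keywords" idea from the reply above: call the deployed scraper endpoint once per keyword. The URL and payload shape are assumptions for illustration, not the video's exact API (Node 18+, run as an ES module):

```js
// Each keyword becomes one call to the (hypothetical) scraper endpoint.
const keywords = ['laptop stand', 'usb hub', 'hdmi cable'];

for (const keyword of keywords) {
  const res = await fetch('https://REGION-PROJECT.cloudfunctions.net/amazon-scraper', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ keyword }), // assumed request shape
  });
  const products = await res.json();
  console.log(`${keyword}: ${products.length} results`);
}
```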

  • @chowadagod
    @chowadagod 1 year ago

    Lovely tutorial. I tried this but I'm getting a CORS error... any idea how to resolve this? Thanks.

    • @workfloows
      @workfloows  1 year ago

      Hi, thanks for your comment and apologies for the late feedback.
      Could you please let me know at which point you get this error (while running the script locally or on GCF)? Do you get any other information from the console?
      Thanks in advance for your kind reply.

    • @chowadagod
      @chowadagod 1 year ago +1

      @workfloows I figured out the error later on. Thanks 🙏
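
The thread doesn't say what the fix was, but a common cause when a browser-based client calls a Cloud Function is missing CORS headers. A sketch of Google's documented pattern, with an illustrative function name:

```js
const functions = require('@google-cloud/functions-framework');

functions.http('scrape', (req, res) => {
  // Allow cross-origin callers; tighten the origin in production.
  res.set('Access-Control-Allow-Origin', '*');

  // Answer the preflight request the browser sends before the real call.
  if (req.method === 'OPTIONS') {
    res.set('Access-Control-Allow-Methods', 'POST');
    res.set('Access-Control-Allow-Headers', 'Content-Type');
    return res.status(204).send('');
  }

  res.json({ ok: true }); // the actual scraping logic would go here
});
```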

  • @pratikguptaji
    @pratikguptaji 9 months ago +1

  • @nocodecreative
    @nocodecreative 11 months ago

    What about the Puppeteer community node?

    • @workfloows
      @workfloows  11 months ago +1

      Hello, thanks a lot for your comment and sorry for my late feedback.
      I had a chance to use the Puppeteer community node, and unfortunately I found it a bit buggy. Since Puppeteer is also fairly demanding in terms of memory, it's much more convenient for me to host it on GCF and call it when needed (a rough sketch of that pattern follows below this thread).
      But that's just my experience - if the Puppeteer community node works well for you, I don't see a reason not to use it 😃
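
A minimal sketch of the GCF hosting pattern mentioned above - Puppeteer wrapped in an HTTP Cloud Function that n8n can hit with an HTTP Request node. The function name and request shape are illustrative, not taken from the video:

```js
const functions = require('@google-cloud/functions-framework');
const puppeteer = require('puppeteer');

functions.http('scrape', async (req, res) => {
  // Launch a headless browser per request; --no-sandbox is commonly
  // needed in containerized environments like Cloud Functions.
  const browser = await puppeteer.launch({ args: ['--no-sandbox'] });
  try {
    const page = await browser.newPage();
    await page.goto(req.body.url, { waitUntil: 'networkidle2' });
    res.json({ title: await page.title() });
  } finally {
    await browser.close(); // always release the browser, even on errors
  }
});
```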

  • @haarTrick
    @haarTrick 10 months ago

    What do you mean by undetectable? Shouldn't you get blocked after a number of requests?

    • @workfloows
      @workfloows  10 months ago

      Hello, thank you for your comment.
      As long as you use IP rotation, the chances that the script will be permanently blocked are rather low (when one IP address gets blocked, another one is used in the next round).

    • @haarTrick
      @haarTrick 10 months ago

      @workfloows How can I apply IP rotation with the method in the video?
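
The thread leaves this open; in line with the video's use of Bright Data's Scraping Browser, rotation comes from connecting Puppeteer to the remote browser instead of launching Chromium locally. A sketch following the endpoint format in Bright Data's docs, with placeholder credentials:

```js
const puppeteer = require('puppeteer-core');

(async () => {
  // Connect to the remote Scraping Browser; proxy/IP rotation is handled
  // on Bright Data's side. USER:PASS is a placeholder for zone credentials.
  const browser = await puppeteer.connect({
    browserWSEndpoint: 'wss://USER:PASS@brd.superproxy.io:9222',
  });

  const page = await browser.newPage();
  await page.goto('https://www.amazon.com/s?k=laptop+stand');
  console.log(await page.title());

  await browser.close();
})();
```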

  • @ablae1234
    @ablae1234 1 year ago

    Is it possible to take this code and modify it to scrape Reddit or any other website, or does each site need a different approach?

    • @workfloows
      @workfloows  1 year ago

      Hi, thank you for your comment!
      This code was created exclusively for Amazon, so scraping Reddit with it isn't really possible. Of course, you can create your own scraping script using Puppeteer, and I strongly encourage you to do so - it's a lot of fun! (A rough skeleton follows below this thread.)
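
To illustrate why each site needs its own approach: the skeleton stays the same, but the URL and selectors are site-specific. An editor's sketch - the selector below is an assumption about old.reddit.com's markup, not verified:

```js
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://old.reddit.com/r/webscraping/', {
    waitUntil: 'domcontentloaded',
  });

  // The site-specific part: a different site means different selectors.
  const titles = await page.$$eval('a.title', (links) =>
    links.map((a) => a.textContent)
  );
  console.log(titles);

  await browser.close();
})();
```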

  • @wallpp
    @wallpp 1 year ago +1

    Long live mate!

  • @MK-jn9uu
    @MK-jn9uu 1 year ago

    What's the purpose of alllll this, if you're just going to use Bright Data?

    • @workfloows
      @workfloows  1 year ago

      Hello, thank you for your comment.
      In the video description, you can find links to the Puppeteer code without Bright Data implemented. If you prefer not to use BD, feel free to use those resources and adapt them to the requirements of any other proxy provider of your choice (a rough sketch of such an adaptation follows below).
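
A hedged sketch of that adaptation - pointing Puppeteer at a generic authenticated proxy instead of Bright Data. Host, port and credentials are placeholders:

```js
const puppeteer = require('puppeteer');

(async () => {
  // Route all browser traffic through the proxy via a Chromium flag.
  const browser = await puppeteer.launch({
    args: ['--proxy-server=http://proxy.example.com:8000'], // placeholder
  });

  const page = await browser.newPage();
  // Authenticated proxies need credentials supplied per page.
  await page.authenticate({ username: 'USER', password: 'PASS' });

  await page.goto('https://www.amazon.com/s?k=usb+hub');
  console.log(await page.title());

  await browser.close();
})();
```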