🛑 How to Scrape UNSCRAPABLE data! (super simple!) Node.js + API

  • Published: Dec 24, 2024

Comments • 107

  • @Hobbitstomper
    @Hobbitstomper 2 years ago +18

    The best solution, IMO, is Node.js + Puppeteer + the puppeteer-extra-plugin-stealth plugin.
    It's free, doesn't rely on any third-party APIs, and works 100% to avoid Cloudflare blocking and other captchas. You can even log into any website, even if it uses OAuth for Google, Facebook, Amazon, Microsoft, Twitter, Apple, etc.

    • @markomarjanovic8348
      @markomarjanovic8348 1 month ago

      Is there any Python-related option? I've read about the stealth plugin; it seems great.
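The Puppeteer plus stealth-plugin combination mentioned in this thread can be sketched roughly as follows. This is a minimal sketch, not a guaranteed bypass: it assumes the `puppeteer`, `puppeteer-extra` and `puppeteer-extra-plugin-stealth` packages are installed, and the names `getStealthPuppeteer` and `fetchRenderedHtml` are made up for illustration. Whether it gets past a particular site's protection depends on that site.

```javascript
// Assumption: npm install puppeteer puppeteer-extra puppeteer-extra-plugin-stealth
let stealthPuppeteer; // cached so the stealth plugin is registered only once

// Hypothetical helper: loads puppeteer-extra lazily and registers the plugin.
function getStealthPuppeteer() {
  if (!stealthPuppeteer) {
    const puppeteer = require('puppeteer-extra');
    const StealthPlugin = require('puppeteer-extra-plugin-stealth');
    puppeteer.use(StealthPlugin());
    stealthPuppeteer = puppeteer;
  }
  return stealthPuppeteer;
}

// Hypothetical helper: opens the URL in headless Chrome and returns the HTML
// after the page's own scripts have run.
async function fetchRenderedHtml(url) {
  const browser = await getStealthPuppeteer().launch({ headless: true });
  try {
    const page = await browser.newPage();
    // networkidle2: wait until there are at most 2 open network connections
    await page.goto(url, { waitUntil: 'networkidle2' });
    return await page.content();
  } finally {
    await browser.close(); // always release the browser process
  }
}

// Usage (example.com is a placeholder):
// fetchRenderedHtml('https://example.com').then((html) => console.log(html.length));
```

The returned HTML string can then be handed to whatever parser the rest of the scraper already uses.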

  • @gaia2933
    @gaia2933 1 year ago +1

    Although there are many software solutions for automating and extracting data from a website, using Node.js and its library ecosystem remains the most flexible option, offering endless possibilities.

  • @dystopian_1
    @dystopian_1 2 years ago +17

    See you in 3 days, Mother of Dragons.

    • @aniakubow
      @aniakubow 2 years ago +1

      See you there 🐉👑

  • @marekr.9339
    @marekr.9339 2 years ago +4

    Greetings from Poland, and I wish you continued success in growing the channel!

  • @joseluisperez5137
    @joseluisperez5137 1 year ago +1

    A little note: at 7:08, use the -D flag when installing nodemon; in this example nodemon is only needed for development.

    • @joseluisperez5137
      @joseluisperez5137 1 year ago +1

      Another side note: as the documentation says, Cheerio is not a web browser instance; it just takes the HTML, parses it and does the job. So if we do this on websites that do no server-side rendering at all, we will be missing some information, since it may be loaded by external sources such as additional scripts, external API calls, etc.

    • @producdevity
      @producdevity 1 year ago

      This side note is more than just a note. It’s the difference between pulling your hair out and understanding right away why some values are populated and some are not.
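The point made in this thread, that a static parser only sees the HTML the server sends and never runs the page's scripts, can be illustrated with a small self-contained sketch. The page content below is invented, and `staticTextById` is a hypothetical regex-based stand-in for a real parser, used only so the example runs without installing anything.

```javascript
// Invented page as the server would send it: the price is filled in later by a script.
const servedHtml = `
<html><body>
  <h1 id="title">Blue Widget</h1>
  <span id="price"></span>
  <script>
    document.getElementById('price').textContent = '$19.99';
  </script>
</body></html>`;

// Hypothetical stand-in for a static parser: returns the text inside the element
// with the given id exactly as it appears in the markup. No script is executed.
// Assumes a plain alphanumeric id and an element containing only text.
function staticTextById(html, id) {
  const pattern = new RegExp(`<[a-z0-9]+[^>]*\\bid="${id}"[^>]*>([^<]*)<`, 'i');
  const match = html.match(pattern);
  return match ? match[1].trim() : null;
}

const title = staticTextById(servedHtml, 'title'); // 'Blue Widget' (present in the markup)
const price = staticTextById(servedHtml, 'price'); // '' (only a browser would fill this in)

console.log({ title, price });
```

When a value comes back empty like `price` here, the data is usually being loaded client-side, and a headless browser or the site's underlying API request is needed instead.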

  • @shrikantjha5630
    @shrikantjha5630 1 year ago

    Yes, I have also used cheeriojs with react native as an experiment and it worked well.

  • @hazemelbatawy1242
    @hazemelbatawy1242 1 year ago

    Great job! I followed your steps and it was really fantastic. I am a data scientist, and you impressed me. God bless you, and if you need anything like machine learning, I am working on algorithms.

  • @irobot8297
    @irobot8297 2 years ago

    Would love to see your computer setup: your desk, keyboard, chair, etc. :)

  • @silversolver7809
    @silversolver7809 2 years ago +7

    Ania, FYI in case it might affect the algorithm: "unscrappable" should only have one "p", i.e. "unscrapable" :)
    Also, I'm not sure it's a common word.

    • @aniakubow
      @aniakubow 2 years ago +1

      You make a very good point! Thanks for having my back 🙌🙌🙌. What do you think would be a better title?

    • @silversolver7809
      @silversolver7809 2 years ago +2

      @aniakubow You're welcome :)
      Ideally put the most relevant words first, as it's likely algos will regard those as more important than later words. So instead of, say, "Make your videos better on YouTube", you should have "YouTube videos: improve yours". Also, it's usually better to use positive words rather than negative ones, e.g. "You will win" > "You can't lose". So:
      Scrape ALL Data
      Scrape EVERYthing
      Scrape and Catch ALL Data
      Emphasize ALL and EVERY, because that's the unique point of this video; if everything is in CAPS, then nothing is emphasized.

  • @SamLinnett
    @SamLinnett 2 years ago

    Nice one Ania - this is really great

  • @PySnek
    @PySnek 2 years ago

    You're the best Ania! Thank you so much!

  • @Crakkovia
    @Crakkovia 2 years ago

    Good work, Ania!

  • @DevMadeEasy
    @DevMadeEasy 2 years ago +2

    Great content, as usual. Thank you so much for sharing it with us. I know how hard it is to build a project, then edit it, post it...
    Thanks🙏

  • @paulthomas1052
    @paulthomas1052 2 years ago +1

    Thanks - very useful as usual :)

  • @richardmasters2045
    @richardmasters2045 2 years ago

    Thank you so much Ania 🥰

  • @ROVAKAN
    @ROVAKAN 2 years ago +2

    In the last video there was an axios + express module, but when I tried it in React the result was CORS errors. Maybe this video will cover that kind of error, and maybe proxy setups too.

    • @aniakubow
      @aniakubow 2 years ago

      I hope it solves your issues too :)
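One common way around the CORS errors described in this thread is to let a small backend make the request and hand the result to the front end with the right headers. The sketch below uses only Node's built-in `http` module and the global `fetch` (Node 18 or later); the upstream URL, the allowed origin, and the function names are all placeholders chosen for illustration, not anything taken from the video.

```javascript
const http = require('node:http');

// Placeholder values: the page to fetch and the origin of the React dev server.
const UPSTREAM_URL = 'https://example.com/';
const ALLOWED_ORIGIN = 'http://localhost:3000';

// Headers that tell the browser the front end is allowed to read this response.
function corsHeaders(allowedOrigin) {
  return {
    'Access-Control-Allow-Origin': allowedOrigin,
    'Access-Control-Allow-Methods': 'GET, OPTIONS',
  };
}

// Builds (but does not start) a server that relays the upstream page.
function createProxyServer(upstreamUrl = UPSTREAM_URL, allowedOrigin = ALLOWED_ORIGIN) {
  return http.createServer(async (req, res) => {
    const headers = corsHeaders(allowedOrigin);
    if (req.method === 'OPTIONS') {
      // Answer the browser's preflight request without contacting the upstream.
      res.writeHead(204, headers);
      res.end();
      return;
    }
    try {
      // Server-to-server requests are not subject to the browser's same-origin policy.
      const upstream = await fetch(upstreamUrl);
      const body = await upstream.text();
      res.writeHead(upstream.status, { ...headers, 'Content-Type': 'text/html; charset=utf-8' });
      res.end(body);
    } catch (err) {
      res.writeHead(502, headers);
      res.end(`Upstream request failed: ${err.message}`);
    }
  });
}

// Usage: createProxyServer().listen(8000);
// The React app then requests http://localhost:8000 instead of the target site.
```

An Express app with a CORS middleware achieves the same thing with less code; the plain `http` version is used here only to keep the sketch dependency-free.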

  • @ThineBrownSmear
    @ThineBrownSmear 2 years ago

    Ty for these tutorials!

  • @drucifer6
    @drucifer6 2 years ago +4

    Amazing content! I'd be curious how to scrape/store data in a database and use that for my own frontend.
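For the storage question raised here, one possible starting point is sketched below. It deliberately swaps a real database for a JSON file so that it runs with nothing installed; the record shape (`url`, `title`, `price`), the file name, and the function names are assumptions made for the example. The same merge-by-key idea carries over to an upsert in whichever database is chosen.

```javascript
const fs = require('node:fs');

// Merge freshly scraped items into those already stored, keyed by URL,
// so re-running the scraper updates records instead of duplicating them.
function mergeByUrl(existing, scraped) {
  const byUrl = new Map(existing.map((item) => [item.url, item]));
  for (const item of scraped) {
    // New fields overwrite old ones; fields missing from the new scrape are kept.
    byUrl.set(item.url, { ...byUrl.get(item.url), ...item });
  }
  return [...byUrl.values()];
}

// Read the current file (if any), merge in the new items, and write it back.
function saveItems(filePath, scraped) {
  const existing = fs.existsSync(filePath)
    ? JSON.parse(fs.readFileSync(filePath, 'utf8'))
    : [];
  const merged = mergeByUrl(existing, scraped);
  fs.writeFileSync(filePath, JSON.stringify(merged, null, 2));
  return merged;
}

// Usage (hypothetical file name and record):
// saveItems('items.json', [{ url: 'https://example.com/a', title: 'A', price: '9.99' }]);
```

A front end can then read the stored records from a small API route that returns the file's contents, rather than scraping on every page load.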

  • @thunde7226
    @thunde7226 2 years ago

    That was great Ania.....................take care ........................:) bye

  • @BreakfastCupNoodles
    @BreakfastCupNoodles 1 year ago

    The problem with a managed one is the cost. With a custom one, you can pay as little as $19/month for 100,000 pages. It's also not hard to scale.

  • @coderizer
    @coderizer 2 years ago

    great video, keep up the good work

  • @joelayoub2774
    @joelayoub2774 2 years ago

    Looking forward to this.

  • @NAHChannel
    @NAHChannel 2 years ago +1

    Great video! I've had better results scraping with XPaths instead of class names on sites that generate their class names dynamically. But it seems to come down to the content being scraped. Scraping using CSS selectors also seems to be faster.

  • @mgusa9372
    @mgusa9372 2 years ago

    Killer look, light pink, that's definitely you. Scrape 'UNSCRAPPABLE' data, yeah, I'm in. I'll be back, spoken in an Arnold hillbilly German accent. Love your stuff. GO Ania.

  • @philipbengtsson2186
    @philipbengtsson2186 2 years ago

    After viewing this video it would be interesting to see what we can do to prevent others from scraping our own website projects. 😅

  • @2ru2pacFan
    @2ru2pacFan 2 years ago +1

    Hey Ania, do you know how to scrape websites blocked by Cloudflare? X

  • @thefeelingofunfair4052
    @thefeelingofunfair4052 2 years ago

    Do you have your series 7 ?

  • @hassaneoutouaya
    @hassaneoutouaya 2 years ago

    Thank you so much !

  • @AMoktar
    @AMoktar 2 years ago

    Amazing you are ❤

  • @briandsouza7854
    @briandsouza7854 2 years ago +17

    I think a 24-hour premiere would be better. This long wait feels annoying!

    • @StephenChapman
      @StephenChapman 2 years ago +2

      It doesn't have to feel annoying. Just tap/click the notify button, then put it out of your mind and move on to thinking about literally anything else in the world.

    • @christian-schubert
      @christian-schubert 2 years ago

      I'll second that. I'd even go as far as calling all those announcements years in advance spam, literally made me unsubscribe from this channel.
      Now, that's not to say that the content itself isn't of high quality. Ania is a real gem - I keep checking back occasionally. 👍

  • @Erwin_t
    @Erwin_t 2 years ago

    WOW amazing tutorial! I love your style and your approach. I am starting web development. I want to learn Vanilla JS your way. What is the best practice to learn and retain the methodology of JS? Please help :)

  • @joelapablaza7722
    @joelapablaza7722 1 year ago +6

    So... if you want to scrape a dynamic website, just go to the sponsor of this video... really?

    • @atlantic_love
      @atlantic_love 1 year ago +1

      Yeap, she's just getting subscribers off her looks, and using these stupid sponsors as her "content". I disliked this video, and another one. In watching the previous one I couldn't figure out whether she just can't type or she doesn't really know what the heck she's talking about.

  • @godswillhycinth9809
    @godswillhycinth9809 1 year ago

    Thanks for the video. Can this also scrape Instagram's HTML content?

  • @urosjovicic3988
    @urosjovicic3988 1 year ago

    I came here to learn; instead I fell in love :D

  • @srishimalah9561
    @srishimalah9561 2 years ago

    @Code with Ania Kubów, hi, your battleship video is unavailable. Can you please look into it? Your video is part one, and parts 2 and 3 are working. I am trying to study the game logic, and it would be very helpful if you could re-upload your video. Thank you.

  • @js_models
    @js_models 2 years ago

    I wish there was an npm Ania command, because she is the total package. 😉

  • @Bot-kl1gs
    @Bot-kl1gs 2 years ago +4

    Hey Ania, can you also include the part where you store the fetched data in a database (like MongoDB) and then show it to the user? It would be a great help OwO OwO

    • @avenazpk
      @avenazpk 2 years ago

      Supabase is the best choice

  • @golgappayadav1864
    @golgappayadav1864 10 months ago

    Can you make a Myntra scraper video?

  • @aleksandrkobelev8868
    @aleksandrkobelev8868 2 years ago

    I love you so much! You are the best!)

  • @ClimbHighWithAI
    @ClimbHighWithAI 10 months ago

    I need to do it with 5000+ products and also need the description, price, etc. How can I do it?
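For a job of that size, one reasonable approach is to process the product URLs with a fixed number of requests in flight, so the target site is not hit with thousands of simultaneous requests. The helper below is a generic sketch; `mapWithConcurrency` and the `scrapeProduct` worker mentioned in the usage comment are hypothetical names, and the right concurrency limit depends on the site being scraped.

```javascript
// Runs `worker` over every item, with at most `limit` calls in flight at once.
// Results are returned in the same order as the input items.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item that has not been claimed yet

  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Usage (scrapeProduct is a hypothetical function returning { description, price }):
// const products = await mapWithConcurrency(productUrls, 5, (url) => scrapeProduct(url));
```

As written, the first failing worker rejects the whole run; wrapping the worker body in try/catch and returning an error marker instead lets a long job continue past individual failures.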

  • @1A_B_C1
    @1A_B_C1 2 years ago +1

    Queen 👸

  • @miguelbcn
    @miguelbcn 1 year ago

    I'm trying to do the same with Twitter to get the tweets from any user, and it seems impossible. Could you help me?

  • @i-am-your-conscience
    @i-am-your-conscience 2 years ago

    I am almost embarrassed to admit how much easier it is to learn such stuff when your teacher is just smokin' hot :D
    besides being an amazing teacher already, don't get me wrong :)

  • @halowarstier3147
    @halowarstier3147 1 year ago

    I am stuck on npm init, not sure how to follow instructions. Please help

  • @techwithulises
    @techwithulises 2 years ago

    I don't want it, I need it

  • @VesuviusAntaria
    @VesuviusAntaria 2 years ago

    Hi Ania! 🙂🌸🏵🌹🌺🌼🌻🌷

    • @VesuviusAntaria
      @VesuviusAntaria 2 years ago

      Thank you for your kiss! You have made my day! 🙂🌺

  • @christiandanielmoralesagui4659
    @christiandanielmoralesagui4659 2 years ago

    See you soon Teacher

  • @jesusmoran1356
    @jesusmoran1356 1 year ago +1

    Didn't work for me

  • @pastuh
    @pastuh 2 years ago

    At such times I would say... AI must understand what to scrape.

  • @kacperkepinski4990
    @kacperkepinski4990 2 years ago

    How did you get that accent?

  • @startupshorts
    @startupshorts 2 years ago

    Hi
    Can you please explain how to scrape emails from LinkedIn?

    • @aniakubow
      @aniakubow 2 years ago

      I think the video should help with that :)

  • @albaraasaad4498
    @albaraasaad4498 1 year ago

    Thank you for the great content. I have a request because I've been searching all over to find a good explanation on how to scrape pages that have a load more button - NOT DIFFERENT PAGES - using Cheerio and Puppeteer. I can scrape a page when it's auto-loading when scrolling down but still couldn't make it by clicking the load more button😭.
    Thank you.

    • @qualitytransportation
      @qualitytransportation 1 year ago

      Just click it with Puppeteer, then load the result with Cheerio

    • @albaraasaad4498
      @albaraasaad4498 1 year ago

      @qualitytransportation I know that it should click, but whenever I try, it's not working. I mean, Puppeteer will not click the load more button. I did locate the button, but I don't know why it's not working.
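The "click it with Puppeteer, then load with Cheerio" suggestion from this thread might look something like the sketch below. It assumes `page` is a Puppeteer `Page` that has already navigated to the listing, and the selectors and the function name `clickLoadMoreUntilDone` are placeholders. One frequent cause of a click that appears to do nothing is a button that is covered or off screen; triggering the click inside the page with `page.$eval` sidesteps that, though it is only a guess at what went wrong in this particular case.

```javascript
// Clicks the "load more" button until it disappears, stops producing new items,
// or maxClicks is reached, then returns the fully expanded HTML.
async function clickLoadMoreUntilDone(page, buttonSelector, itemSelector, maxClicks = 100) {
  let clicks = 0;
  while (clicks < maxClicks) {
    const button = await page.$(buttonSelector);
    if (!button) break; // button is gone: nothing more to load

    // Remember how many items are on the page before clicking.
    const before = await page.$$eval(itemSelector, (els) => els.length);

    // Trigger the click from inside the page, so an overlay or an off-screen
    // position cannot swallow a simulated mouse click.
    await page.$eval(buttonSelector, (el) => el.click());
    clicks += 1;

    try {
      // Wait until the click has actually added items to the list.
      await page.waitForFunction(
        (selector, count) => document.querySelectorAll(selector).length > count,
        { timeout: 10000 },
        itemSelector,
        before
      );
    } catch (err) {
      break; // no new items appeared in time; assume the list is complete
    }
  }
  return page.content();
}

// Usage (selectors are placeholders; cheerio must be installed separately):
// const html = await clickLoadMoreUntilDone(page, 'button.load-more', '.product-card');
// const $ = require('cheerio').load(html);
```

Waiting for the item count to grow, rather than sleeping for a fixed time, keeps the loop from racing ahead of slow responses.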

  • @screendice4107
    @screendice4107 2 years ago

    Thanks, ma'am

  • @ashwinr8317
    @ashwinr8317 2 years ago

    hi ania

  • @screendice4107
    @screendice4107 2 years ago

    Ma'am, I am waiting. Why did you not list this video at the top?

    • @aniakubow
      @aniakubow 2 years ago

      Oh I am not sure! Weird 👀

  • @Dan1eleduardooo
    @Dan1eleduardooo 2 years ago

    Lovely

  • @socar-pl
    @socar-pl 2 years ago +1

    >How to scrape data
    >Use paid service that sponsors this video
    ayyyyyyyyyy lmao

    • @aniakubow
      @aniakubow 2 years ago

      I show two ways to do it so you can choose :)

  • @luissosa7685
    @luissosa7685 2 years ago

    Hello 👋

  • @michaelallen1154
    @michaelallen1154 2 years ago

    I'm here to learn. 🙄

  • @gileneusz
    @gileneusz 1 year ago

    You're really putting in the work ;) scraping is a tough business....... I've been playing around with Diffbot myself lately

  • @rachest
    @rachest 2 years ago

    I need you as my technical partner

  • @abhishekkaith1686
    @abhishekkaith1686 1 year ago

    I personally use jsdom, don't know why lol

  • @desi_vlogs005
    @desi_vlogs005 2 years ago

    👍

  • @roostermarques3583
    @roostermarques3583 2 years ago

    I think someone is trolling off your comments.

  • @balakumar.n4891
    @balakumar.n4891 2 years ago

    How to scrape Formula 1 data?

    • @aniakubow
      @aniakubow 2 years ago

      This video should help I think :)

  • @---fq2kd
    @---fq2kd 2 years ago

    You're great!

  • @jeanmi8184
    @jeanmi8184 2 years ago

    how to scrape your ❤

  • @Jibs-HappyDesigns-990
    @Jibs-HappyDesigns-990 2 years ago

    🥳

  • @dystopian_1
    @dystopian_1 2 years ago +2

    Titanic was lost in your bright eyes.... lovely, lovely you...

    • @Qasim6
      @Qasim6 2 years ago

      😜

    • @trolley2327
      @trolley2327 1 year ago

      I think there is a whole generation of programmers in love with her :))

  • @atlantic_love
    @atlantic_love 1 year ago +1

    Nothing but a sponsor video.

  • @wgalloPT
    @wgalloPT 2 years ago

    I wish so much I had a girlfriend just like you...smart, beautiful and a coder!!

  • @AADJgroup
    @AADJgroup 2 years ago

    😱😇

  • @ThreeLions82
    @ThreeLions82 1 month ago

    What accent is that?

  • @yobi3d
    @yobi3d 2 years ago

    As usual, everything is "very simple"! ) But how am I supposed to watch her? I keep getting distracted )

  • @richardmasters2045
    @richardmasters2045 2 years ago +2

    My cyber girlfriend the smartest woman I know. You have my undying love, respect and devotion 🥰 I can't wait seriously on the edge of my seat 🤓

  • @illegalsmirf
    @illegalsmirf 2 years ago

    Using PHP or Perl?

  • @AlexanderMoyer-k3b
    @AlexanderMoyer-k3b 1 year ago

    um,
    update the old video so that it actually works
    then do this
    Christ, I'd like to do your projects, but I don't know this Node.js technology for new versions!

    • @aniakubow
      @aniakubow 1 year ago

      You can change the version of node.js to the one I am using in the video. Just check the package.json for the version :)

  • @code.design
    @code.design 2 years ago +2

    SCRAPE ME! Do you have an OF?