The best solution imo is Node.js + Puppeteer + the puppeteer-extra-plugin-stealth plugin.
It's free, doesn't rely on any third-party APIs and, in my experience, gets past Cloudflare blocking and other CAPTCHAs. You can even log into any website, even if it uses OAuth for Google, Facebook, Amazon, Microsoft, Twitter, Apple etc.
Is there a Python-based option? I've read about the stealth plugin, and it seems great.
Although there are many software solutions for automating and extracting data from a website, using Node.js and its library ecosystem remains the most flexible option, offering endless possibilities.
See you in 3 days, Mother of Dragons.
See you there 🐉👑
Greetings from Poland, and I wish you continued success in growing the channel!
Thank you 😍😍
A small note at 7:08: use the -D flag when installing nodemon, since nodemon is only needed for development in this example.
Another sticky note: as the documentation says, Cheerio is not a web browser instance. It just takes the HTML, parses it and does the job. So if we do this on websites that don't do server-side rendering at all, we will be missing some information, since it may be loaded by external sources such as additional scripts, external API calls, etc.
This sticky note is more than just a note. It’s the difference between pulling your hair out and understanding right away why some values are populated and some are not.
Yes, I have also used Cheerio with React Native as an experiment and it worked well.
Great job! I followed your steps and it was really fantastic. I am a data scientist and you impressed me. God bless you, and if you need anything like machine learning, I am working on algorithms.
Would love to see your computer setup, your desk, keyboard, chair etc. :)
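Following up on the Cheerio note above, here is a rough way to check which case a site falls into before blaming your selectors. This is only a sketch under assumptions: `isServerRendered` and the two sample pages are made-up names for illustration, and the regex-based tag stripping is a crude approximation, not a real HTML parser. The idea is to take the raw HTML (what Cheerio would receive) and see whether a value you can see in the browser is actually in there.

```javascript
// Returns true when `expectedText` is already present in the raw HTML string,
// i.e. when a plain HTML parser such as Cheerio would be able to see it.
// Hypothetical helper, written for illustration only.
function isServerRendered(rawHtml, expectedText) {
  // Remove <script> and <style> blocks so text that only lives inside code is not counted.
  const withoutCode = rawHtml.replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, '');
  // Drop the remaining tags and collapse whitespace to approximate the visible text.
  const visibleText = withoutCode.replace(/<[^>]+>/g, ' ').replace(/\s+/g, ' ');
  return visibleText.includes(expectedText);
}

// Sample page where the server already put the text into the markup.
const staticPage = '<html><body><h1 id="title">Hello</h1></body></html>';

// Sample page where the text only appears after a script runs in a browser.
const clientRenderedPage =
  '<html><body><div id="root"></div>' +
  '<script>document.getElementById("root").textContent = "Hello";</script>' +
  '</body></html>';

console.log(isServerRendered(staticPage, 'Hello'));         // true
console.log(isServerRendered(clientRenderedPage, 'Hello')); // false
```

If the check comes back false for a value you can clearly see in the browser, the page is probably filled in by scripts, and a browser-driving tool such as Puppeteer is the more likely fit.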
Ania, FYI in case it might affect the algorithm: "unscrappable" should only have one 'P', i.e. "unscrapable" :)
Also, I'm not sure it's a common word.
You make a very good point! Thanks for having my back 🙌🙌🙌. What do you think would be a better title?
@aniakubow You're welcome :)
Ideally put the most relevant words first, as it's likely algos will regard those as more important than later words. So instead of, say, "Make your videos better on YouTube", you should have "YouTube videos: improve yours". Also, it's usually better to use positive words rather than negative, e.g. "You will win" > "You can't lose". So:
Scrape ALL Data
Scrape EVERYthing
Scrape and Catch ALL Data
Emphasize ALL and EVERY, because that's the unique point of this video. If everything is in CAPS, then nothing is emphasized.
Nice one Ania - this is really great
You're the best Ania! Thank you so much!
Good work, Ania!
Great content, as usual. Thank you so much for sharing it with us. I know how hard it is to build a project, then edit it, post it...
Thanks🙏
Thanks - very useful as usual :)
Thank you so much Ania 🥰
In the last video there were the axios and express modules, but when I tried it in React the result was CORS errors. Maybe this video is going to cover that kind of error, and maybe proxy setups.
I hope it solves your issues too :)
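On the CORS errors: the browser blocks cross-origin reads when the target site does not allow your origin, whereas a request made from Node is not subject to that check. So the usual shape is to do the scraping on a small backend and let React call that backend. A minimal sketch, assuming a React dev server on `http://localhost:3000`. `makeScrapeHandler`, `fetchHtml` and the URLs are hypothetical names chosen for illustration, not anything taken from the video.

```javascript
// Origin of the React dev server. This is an assumption; change it to match your setup.
const FRONTEND_ORIGIN = 'http://localhost:3000';

// Builds a Node-style (req, res) handler.
// `fetchHtml` is a hypothetical function that downloads a page on the server side,
// for example a thin wrapper around axios.get.
function makeScrapeHandler(fetchHtml, targetUrl) {
  return async function handler(req, res) {
    // Tell the browser that pages served from FRONTEND_ORIGIN may read this response.
    res.setHeader('Access-Control-Allow-Origin', FRONTEND_ORIGIN);
    try {
      // This request runs in Node, so the browser's CORS check does not apply to it.
      const html = await fetchHtml(targetUrl);
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ htmlLength: html.length }));
    } catch (err) {
      // Report upstream failures to the frontend instead of crashing the server.
      res.writeHead(502, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: err.message }));
    }
  };
}

// Usage sketch (not run here; assumes axios is installed):
//   const http = require('http');
//   const axios = require('axios');
//   const fetchHtml = async (url) => (await axios.get(url)).data;
//   http.createServer(makeScrapeHandler(fetchHtml, 'https://example.com/')).listen(8000);
```

The React app would then request `http://localhost:8000` instead of the target site directly.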
Ty for these tutorials!
Amazing content! I'd be curious how to scrape/store data in a database and use that for my own frontend.
That was great, Ania... take care... :) bye
The problem with a managed one is the cost. For a custom one, you can pay as little as $19/month for 100,000 pages. It's also not hard to scale.
great video, keep up the good work
Looking forward to this.
Great video! I've had better results scraping with XPaths instead of class names on sites that generate their class names dynamically. But it seems to come down to the content being scraped. Scraping with CSS selectors also seems to be faster.
Killer look, light pink that's definitely you. Scrap 'UNSCRAPPABLE' data yeah I'm in, I'll be back, spoken in an Arnold hillbilly German accent. Love your stuff GO Ania.
After viewing this video it would be interesting to see what we can do to prevent others from scraping our own website projects. 😅
Hey Ania, do you know how to scrape websites blocked by Cloudflare? X
Do you have your Series 7?
Thank you so much !
Amazing you are ❤
I think scheduling the premiere 24 hours ahead would be better. This long wait feels annoying!
It doesn't have to feel annoying. Just tap/click the notify button, then put it out of your mind and move on to thinking about literally anything else in the world.
I'll second that. I'd even go as far as calling all those announcements years in advance spam; it literally made me unsubscribe from this channel.
Now, that's not to say that the content itself isn't of high quality. Ania is a real gem - I keep checking back occasionally. 👍
WOW amazing tutorial! I love your style and your approach. I am starting web development. I want to learn Vanilla JS your way. What is the best practice to learn and retain the methodology of JS? Please help :)
So... if you want to scrape a dynamic website, just go to the sponsor of this video... really?
Yeap, she's just getting subscribers off her looks, and using these stupid sponsors as her "content". I disliked this video, and another one. In watching the previous one I couldn't figure out whether she just can't type or she doesn't really know what the heck she's talking about.
Thanks for the video, can this also scrape out Instagram HTML content?
I came here to learn, instead I fell in love :D
@Code with Ania Kubów, hi, your Battleship video is unavailable. Can you please look into it? That video is part one, and parts 2 and 3 are working. I am trying to study the game logic, and it would be very helpful if you could re-upload the video. Thank you.
I wish there was an npm Ania command, because she is the total package. 😉
Hey Ania, can you also include the part where you store the fetched data in a database (like MongoDB) and then show it to the user? It would be a great help OwO OwO
Supabase is the best choice
Can you make a Myntra scraper video?
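For the MongoDB route asked about above, one possible shape is an upsert per scraped record, so that re-running the scraper updates existing documents instead of piling up duplicates. This is a sketch, not the video's code: `saveScrapedItems` is a hypothetical helper, and it assumes each record has a unique `url` field and that `collection` is a Collection from the official `mongodb` driver.

```javascript
// Saves scraped records into a MongoDB collection, one upsert per record.
// `collection` is assumed to be a MongoDB driver Collection.
// `items` is assumed to be an array of objects that each carry a unique `url`.
async function saveScrapedItems(collection, items) {
  let saved = 0;
  for (const item of items) {
    // Match on `url`; insert when missing, otherwise overwrite the stored fields.
    await collection.updateOne(
      { url: item.url },
      { $set: { ...item, scrapedAt: new Date() } },
      { upsert: true }
    );
    saved += 1;
  }
  return saved;
}

// Usage sketch (not run here; assumes the mongodb package and a local server):
//   const { MongoClient } = require('mongodb');
//   const client = new MongoClient('mongodb://localhost:27017');
//   await client.connect();
//   const collection = client.db('scraper').collection('articles');
//   await saveScrapedItems(collection, [{ url: 'https://example.com/a', title: 'A' }]);
//   await client.close();
```

The frontend can then read from the database through your own API rather than scraping on every page view.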
I love you so much! You are the best!)
you are !
I need to do it with 5000+ products, and I also need the description, price, etc. How can I do it?
Queen 👸
I'm trying to do the same with Twitter to get the tweets from any user, and it seems impossible. Could you help me?
I am almost embarrassed to admit how much easier it is to learn such stuff when your teacher is just smokin' hot :D
besides being an amazing teacher already, don't get me wrong :)
I am stuck on npm init, not sure how to follow instructions. Please help
I don't want it, I need it
Hi Ania! 🙂🌸🏵🌹🌺🌼🌻🌷
Thank you for your kiss! You have made my day! 🙂🌺
See you soon Teacher
Didn't work for me
At such times I would say... AI must understand what to scrape.
How did you get that accent?
Hi
Can you please explain how to scrape emails from LinkedIn?
I think the video should help with that :)
Thank you for the great content. I have a request because I've been searching all over to find a good explanation of how to scrape pages that have a load more button - NOT DIFFERENT PAGES - using Cheerio and Puppeteer. I can scrape a page when it auto-loads on scrolling down, but I still couldn't make it work by clicking the load more button😭.
Thank you.
Just click it with Puppeteer, then load the HTML with Cheerio
@qualitytransportation I know that it should click, but whenever I try, it's not working. I mean Puppeteer will not click the load more button. I did navigate to the button, but I don't know why it's not working.
Thanks, ma'am
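For the load more question, the "click it with Puppeteer, then load with Cheerio" suggestion above could look roughly like the sketch below. Assumptions are labeled: `clickLoadMore` is a hypothetical helper, `page` is expected to be a Puppeteer Page, the selector and URL in the usage notes are placeholders, and the fixed pause is a crude stand-in for waiting until the new items have been appended. It won't explain every case where a click seems to do nothing, but it gives a loop to debug against.

```javascript
// Clicks a "load more" button repeatedly until it disappears or a click limit is hit.
// `page` is assumed to be a Puppeteer Page; `selector` is assumed to match the button.
// Returns how many times the button was clicked.
async function clickLoadMore(page, selector, { maxClicks = 20, pauseMs = 1000 } = {}) {
  let clicks = 0;
  while (clicks < maxClicks) {
    // page.$ resolves to null when nothing matches, e.g. once all items are loaded.
    const button = await page.$(selector);
    if (!button) break;
    await button.click();
    clicks += 1;
    // Give the page time to append the new items before looking for the button again.
    await new Promise((resolve) => setTimeout(resolve, pauseMs));
  }
  return clicks;
}

// Usage sketch (not run here; assumes puppeteer and cheerio are installed):
//   const puppeteer = require('puppeteer');
//   const cheerio = require('cheerio');
//   const browser = await puppeteer.launch();
//   const page = await browser.newPage();
//   await page.goto('https://example.com/list');
//   await clickLoadMore(page, 'button.load-more');
//   const $ = cheerio.load(await page.content()); // HTML after all the clicks
//   await browser.close();
```

The `maxClicks` limit is there so a button that never goes away cannot keep the loop running forever.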
hi ania
hiya!
Ma'am, I am waiting. Why did you not list this video at the top?
Oh I am not sure! Weird 👀
Lovely
>How to scrape data
>Use paid service that sponsors this video
ayyyyyyyyyy lmao
I show two ways to do it so you can choose :)
Hello 👋
I'm here to learn. 🙄
You're really putting in the effort ;) Scraping is a tough business... I've been playing around with Diffbot myself lately
I need you as my technical partner
I personally use jsdom, don't know why lol
👍
I think someone is trolling in your comments.
How to scrape Formula 1 data?
This video should help I think :)
You're great!
how to scrape your ❤
🥳
Titanic was lost in your bright eyes.... lovely, lovely you...
😜
I think there is a whole generation of programmers in love with her :))
Nothing but a sponsor video.
I wish so much I had a girlfriend just like you...smart, beautiful and a coder!!
😱😇
What accent is that?
As usual, everything is "very simple"! ) But how am I supposed to watch her? My arousal gets in the way )
My cyber girlfriend the smartest woman I know. You have my undying love, respect and devotion 🥰 I can't wait seriously on the edge of my seat 🤓
Using PHP or Perl?
Um,
update the old video so that it actually works,
then do this.
Christ, I'd like to do your projects, but I don't know this Node.js technology for new versions!
You can change the version of Node.js to the one I am using in the video. Just check the package.json for the version :)
SCRAPE ME! Do you have an OF?