My new step-by-step program to get you your first automation customer is now live! Results guaranteed.
Apply fast: skool.com/makerschool/about. Price increases every 10 members 🙏😤
After building the OpenAI module, I'm facing a rate limit error. Even after upgrading to GPT-4o, I'm facing the same issue.
Any idea how I can fix this?
Awesome tutorial Nick... I can't emphasise enough not only how helpful this tutorial was, but also the number of ideas it has given me - top 5 channel for me!
Really? Who else is in the top 4?
Hi, I want to say thank you for being a great teacher. I appreciate you taking the time to explain things. You are very easy to follow. I always look forward to your next video.
You're very welcome Michelle!
@nicksaraev How can one get into your mentorship/course, please?
I have the same question. It seems it's still under construction, because we only get the curriculum in the video description. @johnringo6155
What did he write in the User Role - "Tell me about this website in JSON format." What did he write after that?
3 minutes in and I know how to scrape a webpage and parse it to text. THANK YOU!!!!
Glad I could help!
Super helpful, thank you. Any chance you could do a tutorial on how to scrape sites that require logging in?
Hey Nick, love the videos!! Just had a few questions; would love it if you could help us out. What is your business model like? Do you offer clients a subscription model or a one-time payment? And what do you think we should apply to our business model, considering we're looking to rope in new clients and remain profitable over time? I ask this because the websites we'll be using have a monthly subscription fee and a limit on API/operation requests, and if the requests exceed the limit of the plan purchased, how do you tackle that? It would be a great help if you could make a short 10-minute video on this, or maybe reply to this comment. Love the series!! Keep up the good work!!
One of the best videos I have come across this year so far. Thanks!
First of all, thank you @Nick Saraev for such useful knowledge. You only scraped one record, but there are a lot of records; how do we scrape all of them? Please give me the answer, I am working on such a project just for learning purposes.
Thank you so much Nick! Brilliant video, every time!
Dude, I cannot overstate how mindblowing this series is. There were so many things in this video that I had absolutely no idea were possible.
Also 1 bed 4 bath is crazy.
Hell ya man! Glad I could help. And SF real estate smh
Hey Nick, great video!! I just have a doubt: if you run this module once for one URL and then put it to sleep, how do you scrape the other URLs? I didn't quite get the hang of how that happens, so it would be nice of you to explain it briefly. Thanks in advance!!
That's amazing, how did I miss this company? You've got a new customer. Great job!
This might be too basic, but how do I understand tokens, and how do I know what limit to set when I'm logging into ChatGPT here? I don't know how to estimate what a job costs, etc.
BUMP!!! me too... no clue how to do this and I just put a random number in lol
Another brilliant video, Nick!
Would be awesome to get a more in-depth tutorial about regex, or what to ask ChatGPT for (what are we looking for, specifically?) in order to scrape. Were you a developer before? You seem to know a lot about web dev.
Thanks again!
Very appreciative of what you’re doing with this series 🙏🏽
It's becoming clear that having a solid understanding of JSON and regex is a must if you intend to build anything decently complex for clients. Any resources, courses, or forums you can point us towards?
Thanks again!
You can always ask ChatGPT for help with this kind of stuff. It explains things in plain English.
Thank you Jeffrey 🙏 agree that it's important. Luckily AI is making it less so: if regex is currently the bottleneck in your flows, you can usually "cheat" by passing the input into GPT-4 with a prompt like "extract X".
To answer your q, though, my education was basically: I watched a few RUclips videos, same as you, and now just use regex101 for fine tuning.
Most of my parsers are extremely simple and real regex pros would laugh at them (but they work for me and my wallet!) .* is your friend
Hope this helps man.
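To make that concrete, here's a minimal sketch of the kind of "extremely simple parser" Nick describes, in Python, run on a hypothetical parsed-to-text listing snippet (the snippet and field labels are made up):

```python
import re

# Hypothetical snippet of a listing page after HTML-to-text parsing.
text = "Price: $450,000 | Beds: 1 | Baths: 4 | 123 Example St"

# Nothing fancy: anchored labels plus a capture group get you far.
price = re.search(r"Price:\s*\$([\d,]+)", text)
beds = re.search(r"Beds:\s*(\d+)", text)

print(price.group(1), beds.group(1))  # 450,000 1
```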
Unfortunately I couldn't get this to work. The parsed HTML seemed to have different data from your example, and I couldn't figure out the regex. You mentioned it could be done with ChatGPT; it would be helpful to know that approach also.
Really great content, thank you for this video!
If I wanted to optimize your flow, I would check whether the URL is already in the Google Sheets document before requesting the page and extracting its data.
Good thinking!
Really impressive stuff in this video, could it be used for a site that requires signing in too?
Hey Nick, amazing tutorial as always, you've massively helped me on so many flows - thank you! I actually managed to build a similar flow, but instead of regex I used an anchor-tag text parser with a filter that checked the element type for the presence of a "page__link" class, since all page links had that. Would you say there's anything wrong with this if it works for the use case?
I am curious how you would tackle getting around a "click to reveal" phone number. It requires 2 clicks to find the phone number.
Super insightful videos, much appreciated! Just FYI, at timestamp 28:20 you're trying to expand the window size; you can do this by clicking the little symbol with the 4 arrows.
Thank you very much Nick for your amazing videos! I'm a beginner and this question may sound dumb, but I'm running a scenario with 2 text parsers following each other. The first one runs 1 operation, but the one following it, using the same input data, runs way more operations. Do you know where that could be coming from? No hard feelings if you don't have time to answer ;)
Again, your instructional videos are so informative. Very much appreciated! Could you post how I can visit multiple websites from a sheet? Would I add a sheet module at the front and another at the end to access the next row?
Appreciate the support! Absolutely, here are the steps for plugging multiple sites in:
1. Create a Google Sheet (sheets.new) with a column labelled "URL".
2. In the Make scenario builder, search for Google Sheets connectors. You're looking specifically for "Search Rows", which has a built in iterator. Make this the trigger of your flow.
3. Authorize your account, select the sheet from earlier in the modal, etc. Set "maximum number of returned rows" to however many you need.
4. Use the output from the "URL" column as the input to the rest of the flow you see in the video. Remember that since "Search Rows" is now a trigger, if you turn this scenario on it'll run every X minutes. So if you don't have a designated flow you might want to make it "on demand" and just run whenever you need to process sites/etc. You can then make another Google Sheet to collect the output and use the "Add Rows" module to fill it up.
Hope this helps!
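For anyone who wants to see the same loop outside of Make, here's a rough Python equivalent (a sketch, assuming you export the sheet to a hypothetical urls.csv file with a "URL" header):

```python
import csv
import requests

# "Search Rows" equivalent: read every row from the exported sheet.
with open("urls.csv", newline="") as f:
    rows = list(csv.DictReader(f))

for row in rows:
    # "Make a Request" equivalent, run once per row.
    resp = requests.get(row["URL"], timeout=30)
    print(row["URL"], resp.status_code, len(resp.text))
```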
Awesome video! I just have one question: how do you get it to cycle through all of the links, and not just use the same exact one every time?
Great stuff. The shared Hidden_API_Masterclass.json seems incomplete; it would be great if the complete JSON could be shared.
Hi Nick! Thank you for your teaching! But... how do I solve the issue with status code 403?
I think adding " " to the value section of the headers fixed this for me!
Hello, I had some ideas to get products from multiple platforms and then compare prices, but I'm not sure the ideas I thought of are good. Could you tell me what you think about it?
Managed to get a 200 response on the first step, but it appears that some of the HTML is hidden. Seems like there's a delay before all the data is populated. I added all the header info. Thanks for the tutorial.
Great video! What if the page you're trying to scrape requires authentication? Like the "my profile" section of Uber or any other company.
I'm trying to do this for a car dealership, but it's not showing all of the other links to the individual cars for sale.
Masterclass, Nick! Thanks a lot for this video.
This is great and well explained. After watching the full length of the tutorial, I'd rather opt for a web scraper tool until I'm good with regex. BTW, any resources on learning regex?
Thx Tobi! Frankly I just use Regex101 for everything (regex101.com), the highlighting as you set your search up is extremely helpful. If you were to quiz me on tokens/selectors without a tool like this I'd probably know fewer than 50% of them 😂
Don't understand why you moved the last Sleep before the Sheets module, but otherwise a great explanation.
Sometimes the regex says it matches on regex101, and then in Integromat it doesn't...
came for the web scraping insights, stayed for the pearly white teeth
Brb getting a Colgate sponsorship
That was fascinating to watch and very clear explanation. Thank you for sharing. I am definitely subscribing!
Welcome aboard!
Hi Nick, how you doing?
First of all I wanna thank you for everything you are doing for us.
I tried to use this automation on different websites, but on a lot of websites the code repeats the same link for the same house/product 2 or 3 times almost in a row, so when you use regex you get repeated results. How can I filter this, or add some kind of condition so I don't get duplicate results and can save operations?
Thank you
I can't get past the HTML to Text module. It keeps giving me an error message: BundleValidationError. Maybe poor HTML on the website I'm scraping? Anyway, thanks for the information. So much to learn!
Thanks so much for the tutorial! Just a question: how do you deal with pagination when scraping data?
Thanks Artem 🙏 you'd create a separate route for the scraper so it can iterate over each page, then add each page's data to an array (using the add() function or similar). On your main route you'd then add a Get Variable module and pull the array contents. Hope this helps.
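In code terms, that accumulate-then-read pattern looks roughly like this (a sketch, assuming a site with a hypothetical ?page=N parameter):

```python
import requests

BASE = "https://example.com/listings"  # placeholder URL

pages = []  # plays the role of the array variable in Make

for page in range(1, 4):  # the separate scraper route, one pass per page
    resp = requests.get(BASE, params={"page": page}, timeout=30)
    pages.append(resp.text)  # the add() step

# "Get Variable" equivalent: the main route reads the collected array.
print(len(pages), "pages collected")
```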
@nicksaraev Thanks so much for sharing, Nick! Hopefully I'll be able to help you somehow one day :)
Hey, amazing work! Just one suggestion: you should cut out in post what didn't work. I got so lost, and trying to do it at home I can't make it happen :(
I did exactly what was shown in the video, but I keep getting the error: Invalid number in parameter 'max_tokens'. How can I fix this?
So what about JavaScript-loaded content? How do you fetch that information?
Is there a way to use proxies for this? I just feel like it’d be pointless to get so deep into this without one
BRO YOU ARE ROCKING IT!!!!!
this is pure gold :)
Thanks for the Tutorial.
Does this also work on Amazon listings?
Glad I could help. Yes it works on Amazon, though be wary that their bot detection is much more sophisticated (see another comment where I discuss how to scrape reviews).
Does it also crawl the entire website or just scrape the given URL?
Hey Nick, what would be a GPT-4 prompt to extract those URLs and build the regex?
Great work. Can you make a tutorial on how to scrape data from LinkedIn?
To get it right... you feed the whole HTML content to GPT, so you pay the input tokens for all of it. Isn't it possible to feed just the body, or a single container ID or class?
Thanks for making this video; very helpful with a few automation projects that I have. I've never heard of Make before. I've spent the last two years making a local webhook application as a side project that basically does the same thing as Make, but this site is so much better.
You're very welcome! I'm a dev as well and find Make better for 99% of business use cases. The only time I build something out in code these days is when a flow is extremely operationally heavy. Keep me posted 🦾
Thanks Nick, super helpful. Will set this up right away :D
Hell ya man! Let me know how it goes.
Fantastic stuff, thank you so much Nick!
Thank you for your sharing, it has truly benefited me a lot.
thanks for teaching us sir. Appreciate it!
How would you address the legality of scraping?
Hi Nick, thanks for the video. I'm having a problem with the parser... it's not parsing down the text for me like it shows in the video. Any suggestions on this?
HELP! Hey guys, I couldn't get past the 403 error you came across when you got to the individual listings!
What do I do?
Hi Nick, is it possible to scrape a page that does not have an API and that you have to be logged into, please?
How do you get past authentication to scrape for resources that require a sign in?
Gonna check it out; wonder if it could be used for comments on a post, or Twitter. For example, someone says they want something, then boom, you can respond.
Thanks Raiheen! You could, although there are probably better solutions to this. Facebook/Twitter/etc often hide comments behind a "reveal" mechanism like scrolling or a "Read More" button which makes scraping them difficult (in addition to their security and tight rate limits).
That said, anything is possible in 2024! You could run a search every X minutes using a search bar and scrape the top X comments. You'd use an intermediary DB of some kind to store the comment text, and then for every comment in your scrape, if that comment doesn't already exist, you could fire up a browser automation tool like Apify and log on to the platform in question. You'd then have GPT-4 or similar dream up a response and post it using JavaScript.
Hope this helps man 🙏
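A minimal sketch of the dedupe step in that flow, assuming comments arrive as (id, text) pairs and using an in-memory set where a real flow would use the intermediary DB:

```python
seen_ids = set()  # stand-in for the intermediary DB of seen comments

def handle_comment(comment_id: str, text: str) -> None:
    if comment_id in seen_ids:
        return  # already processed; skip and save operations
    seen_ids.add(comment_id)
    # Here a real flow would have GPT-4 draft a reply and a browser
    # automation tool post it.
    print(f"new comment {comment_id}: {text[:40]}")

handle_comment("c1", "I really want a tool that does X")
handle_comment("c1", "I really want a tool that does X")  # silently skipped
```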
For OpenAI, does it need payment (credits) to generate an output, or do you just need to get the API key and that's it?
So it scrapes, but you have to sign up... hardly feels private, does it?
How would I use this if I have to log in to a site in order to scrape it? Is there a login prompt to add before the site prompt? Thanks for all the info!!!
Happy you found this valuable Daniel!
It depends on the site: sometimes you can just pass a username/password in the HTTP request module to get the cookie, other times you need to use browser automation tools like Apify. I recorded an in-depth video on authentication here if you're interested: ruclips.net/video/R8o7V39NSSY/видео.html
Hope this helps 🙏
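For the username/password case, here's a sketch of the cookie idea with a requests session (the login endpoint and form field names are hypothetical; inspect the site's actual login form):

```python
import requests

session = requests.Session()

# Hypothetical endpoint and field names; check the real login form.
session.post(
    "https://example.com/login",
    data={"username": "me@example.com", "password": "secret"},
    timeout=30,
)

# The session now carries the auth cookie for subsequent requests.
resp = session.get("https://example.com/my-profile", timeout=30)
print(resp.status_code)
```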
How can I code the headers for scraping data from TikTok? Is a specific type of header required to imitate a legitimate user or device?
Thank you for the information. BTW, your copyright has not been updated. :)
How do you scrape the images in the same process?
Is it possible to scrape Wikipedia?
It's not working when I follow your steps.
What camera are you using, is it Lumia by any chance? Thanks!
Because of this comment & a few others, I just published a full gear list in the description! Including camera, lens, lighting, etc :-) all the best
Nicely brought!
So happy you found value in this man.
First, you should explain what scraping a website is. 🤔
There are other videos for that; this one is for those taking the next step.
Curious why you decided to watch the video if you didn’t know what it was
Can we add proxies to these flows?
Yes, definitely. You'd just replace the URL in the HTTP Request module with whatever your proxy is and then add the proxy-specific data (most proxies will require you to send credentials, the URL you want to passthrough, etc).
@nicksaraev Can you show us an example of this?
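In the meantime, a hedged sketch of the idea in Python, assuming a generic authenticated HTTP proxy (the host and credentials are placeholders):

```python
import requests

# Placeholder credentials and host; substitute your provider's details.
proxies = {
    "http": "http://user:pass@proxy.example.com:8080",
    "https": "http://user:pass@proxy.example.com:8080",
}

# Same request as before, just routed through the proxy.
resp = requests.get("https://example.com", proxies=proxies, timeout=30)
print(resp.status_code)
```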
You only logged the first listing on your Redfin search. How does it loop to the second and so on?
Great q. The flow automatically loops because the "Match Pattern" module outputs multiple bundles. When multiple bundles are output by a module, every module after that module runs anew for each respective bundle.
Hope this helps 🙏
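If you think in code, a module that outputs multiple bundles behaves like a generator: everything downstream runs once per yielded item. A toy sketch:

```python
def match_pattern(html: str):
    # Stand-in for "Match Pattern": yields one bundle per match.
    for url in ("https://a.example", "https://b.example"):
        yield url

for bundle in match_pattern("<html>...</html>"):
    # Every downstream module runs anew for each bundle, like this body.
    print("processing", bundle)
```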
Thank you! Can I use this to web scrape all the reviews of a product on Amazon?
Absolutely, just checked for you. You have to do it in two parts:
1. Feed in the Amazon product URL to a Request module like I show in the video. Then scrape HTML and parse as text.
2. Somewhere in the resulting scrape will be a URL with a string like /product-reviews/. You need to match this (can use regex). Then make another request to that URL for product reviews.
Amazon's bot detection is very good so be careful you don't get rate limited 🙏
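Those two parts, sketched in Python (the product URL is a placeholder, and Amazon frequently blocks plain requests, so treat this as the shape of the flow rather than a working scraper):

```python
import re
import requests

headers = {"User-Agent": "Mozilla/5.0"}  # minimal browser-like header

# Part 1: fetch the product page.
html = requests.get(
    "https://www.amazon.com/dp/EXAMPLE", headers=headers, timeout=30
).text

# Part 2: find the /product-reviews/ link and request that page too.
match = re.search(r'href="(/[^"]*?product-reviews/[^"]*?)"', html)
if match:
    reviews_url = "https://www.amazon.com" + match.group(1)
    reviews_html = requests.get(reviews_url, headers=headers, timeout=30).text
    print(len(reviews_html))
```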
But this only works if the content you want to grab is present as text in the page source. If it's created dynamically, say by a JS script, the module wouldn't be able to grab the desired data. For example, if I want to grab price data from a web page, the content grabbed by the Make a Request module would contain something like PRICE: $0.00, but on the actual page it's PRICE: $3.70; that last value is created dynamically and doesn't show up this way in the Make module...
Thanks for bringing this up. Will cover this in an updated video 🙏
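Until then, one common workaround (not covered in this video) is a headless browser that runs the page's JavaScript before you grab the HTML. A sketch with Playwright, using a placeholder URL:

```python
# pip install playwright && playwright install chromium
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com/product")  # placeholder URL
    page.wait_for_load_state("networkidle")  # let scripts fill in prices
    html = page.content()  # rendered HTML: $3.70, not the $0.00 placeholder
    browser.close()

print(len(html))
```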
Great info!
The HTML objects I get have >50,000 characters, and when I try to paste this back into a Google Sheets cell, I get an error.
Any tips on how to reduce/clean up the HTML object I get back?
For example, the ScrapeNinja module offers a JavaScript field you can use to filter this out on the go, but their APIs are paid :/
Try the split function to divide the data into pieces, then send the pieces to separate cells in Google Sheets.
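Something like this sketch (Google Sheets caps a cell at 50,000 characters, so chunk safely below that):

```python
def chunk(text: str, size: int = 45_000) -> list[str]:
    # Split one long HTML string into pieces that each fit in a cell.
    return [text[i:i + size] for i in range(0, len(text), size)]

pieces = chunk("x" * 120_000)
print([len(p) for p in pieces])  # [45000, 45000, 30000]
```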
Great job! Thank you!
Doesn't Make support CSS selectors?
How about content behind a paywall?
Just recorded a video to answer this (hidden APIs)! Hope it helps you.
Skip to 2:47
If a contact is blocked on WhatsApp, calls don't connect and messages don't get delivered, and I don't have any number. So if I bang my head against my phone, will the call reach you by telepathy?
It appears the free version of ChatGPT doesn't work with this. Still, interesting.
At the 13:31 mark, could you have your ChatGPT with custom instructions entered versus writing JSON to get a better email intro?
Yes, definitely! PS: the quality usually goes up if you let it output plaintext. This isn't as relevant for my purposes, but something to keep in mind if you're generating content (say, blogs etc.)
My friend, thank you so much for your videos, I really appreciate it. Again: any Go High Level platform review?
I will absolutely do one on GHL; I used to sell their platform as an affiliate, actually. Tbh I don't like their "automations" one bit, but it's important enough to go through. Probably next month as I finish the course and the rest of my videos. Thank you for the idea!
@nicksaraev Thanks buddy, I tried it but it was a bit overwhelming for a beginner.
Great Job!
Thank you Terry!
Golden content
So glad you find it valuable man
I don't understand how to find a regex
How do you bypass a robots.txt file blocking the scraper?
Same, would like to know the answer to this.
You can't with this, I'm pretty sure.
I need to scrape a website for an AI web app, so I can pull Q&A, company info, etc. into fields on the web app. Is that possible?
Absolutely. I did something similar for a data viz SaaS a while back. You'd have to find a way to parse each of those strings (Q&A, Company Name, Company Description, etc) and then pass them to your app db. You can use AI for this if there's no consistent pattern-something like "Categorize the following text into XYZ using the following JSON format".
Hope this helps man 🙏
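A sketch of that kind of prompt with the OpenAI Python client (the model name and JSON keys are placeholders; adjust them to your app's schema):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

scraped = "Acme Corp builds data viz tools. Q: Is there a free tier? A: Yes."

response = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},  # force valid JSON back
    messages=[{
        "role": "user",
        "content": 'Categorize the following text into JSON with keys '
                   '"company_name", "company_description", and "qa": ' + scraped,
    }],
)
print(response.choices[0].message.content)
```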
Hello,
How can I convert a super large XML file (literally a huge stack of file archives) into simple site JSON?
I'd sing at anyone's wedding if anyone can share... lol jk, but I truly would be very thankful for any suggestions.
And nice channel, 100; you got a new sub here ⚜️
I don't get it... you said "by the end of it you'll know everything that you need to know about how to scrape, like you'll be better than 99% of the rest of the world at scraping sites, and you don't even really need to know HTML or anything like that because we're going to use AI to help"... but if you don't know how to create a key to connect OpenAI, or you have no idea what JSON is... C'mon! HTML is the easiest of all of that... :( I was hopeful and eager to follow you... now I'm in a rabbit hole.
How do I tackle error 429?
429 is basically always a rate limit error. You can either slow down, add "Break" modules to intelligently retry, use platforms like HookDeck to queue and then space out your requests, or send from different IP addresses. Hope this helps, Shreyas 🙏
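The slow-down-and-retry option, sketched as exponential backoff (a stand-in for the "Break" module approach):

```python
import time
import requests

def get_with_backoff(url: str, retries: int = 5) -> requests.Response:
    for attempt in range(retries):
        resp = requests.get(url, timeout=30)
        if resp.status_code != 429:
            return resp
        time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, ... before retrying
    return resp  # give up and return the last 429 response

print(get_with_backoff("https://example.com").status_code)
```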
Hi, are you available to build something like this for me, or can you refer someone? Thanks, Dan
Hey Dan, happy to chat. Shoot me an email at nick@leftclick.ai and if I can't help I'll refer you.
This is very well explained!!!!!!!!
Can you scrape a banking account?
Only my own 😫