It's a game-changer! Imagine how much time we save by this feature. Thank you so much John as always.
Brilliant, I love that John seems so nonchalant while dropping a game-changing feature. Thanks for sharing!!!
Amazing video!
I really like all of your videos and consider you my favorite code tutor; your content is very understandable and clearly takes a lot of hard work.
Thank you very much for your great journey through the web scraping world. I will always be in awe of the knowledge you share and the effort you put into finding it.
Thanks truly.
Great to hear! Thank you, very kind
This makes my brain want to explode! This is amazing, so many potential applications. Thanks!
Super useful session. Great example of creating good and not a flaky scenario. Thank you very much.
Thanks John for covering scrolling down and up.
Hope it helps!
I'm from Indonesia.. really good information.. still waiting to scrape data with this method.. God bless you.. you saved me time🥰
Thank you for watching I’m glad you liked it!
Thanks John 👍👍
The website has changed and while I couldn't follow your example exactly I did use the codegen and extracted the total price of the cart. Couldn't use page.query_selector and instead used:
page.frame_locator("#iFrameResizer0").locator('div[data-test=cart-total] span[data-test=cart-price-value]').text_content()
Seems a bit messy, and I can understand why it might be easier to hand the page over to BeautifulSoup.
Thanks again. Great video and great tool.
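For anyone trying the same thing, the one-liner above can be wrapped up roughly like this. It is only a sketch: the two selectors are copied from the comment and may no longer match the site, `cart_total_text` and `parse_price` are made-up helper names, and the price parsing assumes "." is the decimal mark.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only used by type checkers, nothing is imported at runtime
    from playwright.sync_api import Page

# Selectors copied from the comment above; the site may have changed again.
CART_IFRAME = "#iFrameResizer0"
CART_TOTAL = "div[data-test=cart-total] span[data-test=cart-price-value]"


def cart_total_text(page: Page) -> str | None:
    """Return the raw text of the cart total, which lives inside an iframe."""
    frame = page.frame_locator(CART_IFRAME)  # step into the iframe first
    return frame.locator(CART_TOTAL).text_content()  # waits for the element to exist


def parse_price(text: str) -> float:
    """Keep digits and '.' only, so currency symbols and thousands commas are dropped."""
    digits = "".join(ch for ch in text if ch.isdigit() or ch == ".")
    return float(digits)
```

With a real page you would call `parse_price(cart_total_text(page))` after the cart has loaded; `frame_locator` is needed because `page.locator` on its own does not look inside iframes.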
That’s great! I find this sort of method, especially using codegen, is best for automating tasks rather than information gathering, which I much prefer to do in other ways, like you mentioned.
As always, John, great content!
Can you create a video showing the best way to prevent a site from detecting that you're using a scraping bot? I tried a project before, using different ways to get Zillow data, but a captcha kept popping up.
What is `Zillow data`?
As a rough idea:
You can use an existing Chrome/Firefox browser started with a remote debugging port, so that you can do your work after solving the captcha.
I did the same when I was working with Selenium and facing the same bot detection issue, and reusing the existing browser helped me a lot🙂
Very fancy. Thanks for the video!
Glad you enjoyed it!
Hey man, can you do a video about scraping with asynchronous playwright?
Hey, yeah sure, I have a few more videos in my Playwright series and I’ll be sure to cover async.
Looking forward to it. Planning to ditch Selenium.
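Until that video exists, here is a rough idea of what the async version can look like. This is a sketch, not the approach from the series: `fetch_title`, `fetch_titles` and `demo` are made-up names, and the Playwright import sits inside `demo()` so the helpers can be read and tested on their own.

```python
import asyncio


async def fetch_title(browser, url: str) -> str:
    """Open one tab, load the URL, return the page title, and always close the tab."""
    page = await browser.new_page()
    try:
        await page.goto(url)
        return await page.title()
    finally:
        await page.close()


async def fetch_titles(browser, urls: list[str]) -> list[str]:
    """Load all URLs concurrently; results come back in the same order as `urls`."""
    return await asyncio.gather(*(fetch_title(browser, url) for url in urls))


async def demo(urls: list[str]) -> list[str]:
    """Run the helpers against a real headless Chromium (requires playwright)."""
    from playwright.async_api import async_playwright  # imported lazily on purpose

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        try:
            return await fetch_titles(browser, urls)
        finally:
            await browser.close()
```

To try it, run something like `asyncio.run(demo(["https://example.com"]))`. The gain over the sync API is that the pages load at the same time instead of one after another.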
What is `Scraping`?
I think that playwright is a nice automation tool, but the documentation and tutorials are kinda scarce when we talk about Python with Pytest and playwright.
It's also hard to find tutorials about the page object model with python/playwright.
Basically playwright is new and shiny but comes at a cost.
The developers of playwright mostly work in JavaScript, which I don't use.
What's the difference between Python and Pytest?
@@markcuello5 Python is a programming language.
Pytest is a testing framework in Python.
Are there any reasons now to use selenium at all? I mean it was only useful if you needed to click something or enter input, but playwright can do that now.
I don’t have any reason to use selenium anymore. It does do the same thing but playwright feels better to me, and easier to use too.
Selenium is still the de facto automation tool.
Playwright might be the new kid on the block but Selenium is still solid.
thumbs up and subscribed, thank you for your videos
Nice. Didn't know playwright had that feature!
How can we generate code for a device like an iPhone?
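On the device question: codegen accepts a device name on the command line, for example `playwright codegen --device="iPhone 13"`, and the generated script then uses Playwright's built-in device presets. A minimal sketch of the same idea in a script, where `new_device_context` is a made-up helper name and "iPhone 13" is just one preset chosen as an example:

```python
def new_device_context(playwright, browser, device_name: str = "iPhone 13"):
    """Create a browser context that emulates one of Playwright's device presets."""
    # `playwright.devices` maps preset names to settings such as viewport and user agent.
    device = playwright.devices[device_name]
    return browser.new_context(**device)


def demo(url: str) -> str:
    """Open `url` as an emulated iPhone and return the page title (requires playwright)."""
    from playwright.sync_api import sync_playwright  # imported lazily on purpose

    with sync_playwright() as p:
        browser = p.webkit.launch()
        context = new_device_context(p, browser)
        page = context.new_page()
        page.goto(url)
        title = page.title()
        browser.close()
        return title
```

Calling `demo("https://example.com")` should load the page with a phone-sized viewport and a mobile user agent. This is emulation only, not a real device.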
Ok yeah you just talked me in to switching to playwright lol
I’m really enjoying working with it!
How do I connect to an existing browser? I want 2 scripts to use 1 browser with 2 tabs.
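One possible way, for Chromium-based browsers only: start Chrome yourself with `--remote-debugging-port=9222` and let each script attach to it with `connect_over_cdp`. The sketch below assumes that port and uses a made-up helper name, `open_tab`; every script that calls it gets its own tab in the same browser window.

```python
def open_tab(playwright, url: str, endpoint: str = "http://localhost:9222"):
    """Attach to an already running Chrome and open `url` in a new tab."""
    # Assumes Chrome was started with --remote-debugging-port=9222.
    browser = playwright.chromium.connect_over_cdp(endpoint)
    # Reuse the browser's existing context (cookies, logins) when there is one.
    context = browser.contexts[0] if browser.contexts else browser.new_context()
    page = context.new_page()
    page.goto(url)
    return page


def demo(url: str) -> str:
    """Open a tab in the running browser and return its title (requires playwright)."""
    from playwright.sync_api import sync_playwright  # imported lazily on purpose

    with sync_playwright() as p:
        return open_tab(p, url).title()
```

Because the browser is one you opened by hand, you can also log in or solve a captcha manually first and let the scripts carry on from there.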
Great video! Thank you for sharing! How does playwright handle waits?
Thanks! You can use the “is visible” on an element and it will wait until that specific element is shown
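A small addition to the reply above: in the Python API, `locator.is_visible()` checks once and returns straight away, while `locator.wait_for(state="visible")` blocks until the element shows up or the timeout expires. Actions such as `click()` also wait on their own. A minimal sketch, with a made-up helper name and an arbitrary 10 second timeout:

```python
def text_when_visible(page, selector: str, timeout_ms: float = 10_000) -> str:
    """Wait until `selector` is visible, then return its text.

    Playwright raises its TimeoutError if the element does not appear in time.
    """
    target = page.locator(selector)
    target.wait_for(state="visible", timeout=timeout_ms)  # blocks, no sleep needed
    return target.inner_text()
```

Used as `text_when_visible(page, "#results")`, this replaces a fixed `time.sleep()` with a wait that ends as soon as the element is there.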
How do I open codegen without incognito mode?
Hi, how can we do this with playwright, if it is even possible? In Scrapy with selenium we use response.replace(body=webdriver.page_source). I tried response.replace(body=page.content()) with playwright, but this doesn't work. Kindly help!
Nice video, though I would suggest that putting a sleep in to wait is bad practice. For demonstrating something, okay, got it... But I'd suggest having a method that uses the other functions available in PW to test whether the item/screen element is visible... Sorry, I don't know Python, but the pseudo code would be:
public boolean SeekButtonAndClick( page, locatorString )
{
    let elementToFind = page.locator( locatorString );
    let screenHeight = page.viewportSize().height;   /* height of one screen */
    let currentTop = -1;   /* -1 guarantees at least one pass */
    let windowTop = page.evaluate( () => window.scrollY );   /* current display top */
    while ( elementToFind.isVisible() is false and currentTop != windowTop )
    {
        currentTop = windowTop;   /* remember where we were before scrolling */
        page.mouse.wheel( 0, screenHeight );
        windowTop = page.evaluate( () => window.scrollY );
    }
    if ( elementToFind.isVisible() )
    {
        elementToFind.click();
        return true;   /* success */
    }
    else
    {
        return false;   /* never found */
    }
}
The calling routine should already have the page. Pass in the locator string for the element that will come into view.
We take the height of the screen to know how far to scroll down; sending keyboard 'PageDown' would work too. We then loop until the element is visible. Comparing windowTop with currentTop prevents an infinite loop: after each scroll the position either updates or it doesn't. If the control scrolls into view it is clicked and we return true. If the position stops changing, the page won't scroll any further and the element was never found, so we return false. The evaluate reads the scroll position of the browser window, i.e. the top of the current viewport. Scrolling down only increases it and it is never < 0, so we start currentTop at -1 to ensure at least one execution.
This has no explicit wait and should work well.
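For Python users, the idea in the pseudo code might translate roughly as follows. This is a sketch under a few assumptions rather than a drop-in routine: `seek_and_click` and `max_scrolls` are made-up names, and instead of the mouse wheel it scrolls with `window.scrollBy` inside `page.evaluate`, so that the scroll and the position check happen in the same call.

```python
# Scroll one viewport down and report the new scroll position in one round trip.
SCROLL_ONE_SCREEN = (
    "() => { window.scrollBy(0, window.innerHeight); return window.scrollY; }"
)


def seek_and_click(page, selector: str, max_scrolls: int = 50) -> bool:
    """Scroll down until `selector` is visible, then click it.

    Returns True if the element was clicked, False if the page stopped
    scrolling (or `max_scrolls` was reached) before the element appeared.
    """
    target = page.locator(selector)
    previous_top = -1  # scrollY is never negative, so the loop can run at least once
    current_top = page.evaluate("() => window.scrollY")
    scrolls = 0
    while (
        not target.is_visible()
        and current_top != previous_top  # position changed, so there is more page
        and scrolls < max_scrolls  # safety net for endless feeds
    ):
        previous_top = current_top
        current_top = page.evaluate(SCROLL_ONE_SCREEN)
        scrolls += 1
    if target.is_visible():
        target.click()
        return True
    return False
```

One caveat: on pages that load more content as you scroll, the new content may not have arrived when the position is compared, so the loop can give up early. In that case a wait on the element or on network activity between scrolls is probably still needed.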
Thanks for the detailed response, and you're absolutely right: using time is a very bad idea and was used for demonstration purposes only. I would always recommend waiting on an element or on network activity, plus an error handler, for better code.
@@JohnWatsonRooney True, but even my suggestion could be improved. I'm only worried because, as a contractor, I see people notice the sleep, just put it in their code and move on... It works... The difference: on one project where people had put in timers/waits, the run took 1.5 hours; waiting on elements, which we both agree is the way to go, it now runs in about 7 minutes :)
Wouldn't it be better to use data-testIDs instead of simulating the scrolling behavior?
Hi sir, can you make a video on how to bypass reCAPTCHA on an airline website? E.g. the reCAPTCHA badge in the bottom right or left corner of the home page, which also shows up when monitoring requests.
I'm trying to automate saving of a video clip via a web UI that gets clips from a server. The mpeg video doesn't load in chromium. Is there a way to record the code using chrome?
Great video! What did you do with those 3 jackets? 😅
How do we make it run multiple times?
brilliant, Thanks for sharing this video
Thanks for your video! Question: can we use the toBeVisible() method instead of click() when we are checking for the presence of an element on the page?
Is there any alternative in Puppeteer?
I am very grateful to you for this video. Could you please make a video on how I can upload this code to a server?
Thanks again.
Does this also work on React?
Hi John! A very big fan here.
Very helpful information, but what if there’s a captcha that we need to break?
This is so awesome, but in my case "playwright codegen" doesn't do anything and I have no idea why :(
hmm not sure sorry, I've not had any issues before
@@JohnWatsonRooney Thanks for your reply. It worked after a Windows restart, and your method also worked for the project I'm working on. You're a great teacher!
@@liviud3d604 Great stuff! and thank you!
No doubt it is a great feature. It will make our life easy. 👌💖
So powerful. I really appreciate it 🙏🏼🙏🏼
Thank you, great video as always. To avoid scrolling issues, is it possible to set the height of the browser window to a very big value? (2000, 4000....)
I think that is possible but I haven’t tried to see what the limit is!
That was my thought as well. Don't suppose you tested the idea? Super curious
@@Analyse_US I tried; it seems that we are limited to the screen size, but there is a workaround: you can reduce the zoom level (10% or less), which is like having a huge browser window.
@@vincentdigiusto9429 Thanks!
Hi, thanks for sharing this.
In your previous vid I commented about a 30-minute to hour-long Python crash course.
I’m aware that there are quite a lot of videos out there, but I feel like your way of explaining is very clear and concise.
As for playwright, I really like the video and would love to see more of this.
Thanks, I really appreciate your feedback. I've started writing down ideas for a Python crash course, and I think if I choose the right topics it will work.
Great video, but I've a question about the run function. I've never seen a python function taking a dictionary-like argument (playwright: Playwright)
What does it do? Also what does -> None do?
Pandas has functions that take dictionary args, for renaming column headers. You pass a dict as an arg with the current header as a key & what you want it to be renamed as the value...
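To answer the original question more directly: in `def run(playwright: Playwright) -> None:` the `playwright: Playwright` part is a type hint, not a dictionary. It says the argument is expected to be a `Playwright` object, and `-> None` says the function returns nothing. A small sketch with made-up functions (it needs Python 3.9+ for `list[float]`):

```python
def greet(name: str, times: int = 1) -> None:
    """`name: str` means name should be a str; `-> None` means nothing is returned."""
    for _ in range(times):
        print(f"Hello, {name}!")


def total(prices: list[float]) -> float:
    """Here the hint after the arrow promises a float result."""
    return sum(prices)


# Hints are stored on the function object but are not enforced when the code runs.
print(greet.__annotations__)
greet("Playwright")
greet(42)  # still runs; only a type checker or an editor would complain
```

The hints change nothing at runtime. They help editors with autocomplete and let tools such as mypy catch mistakes, which is presumably why codegen includes them in the script it writes.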
Your head curve is like that or is it the hairs ?
They create
Can we use proxies?
Yes you can, the proxy goes within the browser launch command
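Roughly, that looks like the sketch below. `launch()` takes a `proxy` dictionary with a `server` key plus optional `username` and `password`; the helper names and the proxy address in the usage note are placeholders, not real values.

```python
from __future__ import annotations


def proxy_settings(
    server: str, username: str | None = None, password: str | None = None
) -> dict:
    """Build the `proxy` dictionary that browser launch() accepts."""
    settings = {"server": server}
    if username is not None:  # credentials are only needed for authenticated proxies
        settings["username"] = username
        settings["password"] = password or ""
    return settings


def launch_with_proxy(playwright, server: str, **credentials):
    """Launch Chromium with all traffic routed through the given proxy."""
    return playwright.chromium.launch(proxy=proxy_settings(server, **credentials))
```

Inside a `with sync_playwright() as p:` block this would be called as `launch_with_proxy(p, "http://proxy.example:8080", username="user", password="pass")`, with your own proxy details swapped in.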
"I've developed a web application using Playwright for automation testing. Could you provide guidance on how I can reliably deploy this application to a production environment, ensuring it runs smoothly and securely? What best practices or strategies would you recommend for deploying a Playwright-based application to a production server or hosting platform?"
Wow
lovely!
Great 👍👍
Thank you!
nice
RIP Selenium
I certainly don’t think I’ll use it again
@@JohnWatsonRooney So far, playwright exists as open source because of selenium.
Help me
Awesome
Wtf this is amazing🥵
3:53 pee stain