Hello!
I want to do a personal project for my portfolio (I'm trying to get my first data scientist job) with an NBA theme, since I recently became a big fan. I wanted to do the project from zero, which means scraping. The thing is, scraping was the one thing I had zero knowledge of.
I found this video, which is absolute pure gold. I'm on Windows, so I had to use sync mode and change a few things. It's working! I also tried to personalize a few things, and I commented the whole code. I'd love to get in touch with you for some insights from now on, so the project is not a copy of yours, per se.
Thank you for the video, this kind of knowledge is much needed! Cheers from Brazil.
Glad it helped you!
Hey! Thanks for the amazing tutorial. I can't understand one thing.
We are preparing all these features for the ML model to train on, but if we want to predict future games, these features won't be available. So what will the inputs to the trained model be?
If you are coming from this with some knowledge about basketball, the "standings" mentioned are not actually standings, but the game schedule for that month. It was throwing me off a bit when continuously referenced...not sure it bothers anyone else but thought it worth mentioning.
28:50 How can you run await outside of a function? I don't really use jupyter. I tried something like z = [await scrape_season(x) for x in SEASONS], scrape_season(z) but neither worked. Any help appreciated
You can use await inside Jupyter notebook since everything in Jupyter is already running inside an async event loop.
I would recommend stripping out async if you're writing a regular Python script outside of Jupyter. You'll use the Playwright sync api (instead of the async api). You'll have to replace the import of playwright with `from playwright.sync_api import sync_playwright, TimeoutError as PlaywrightTimeout`.
Then you'll need to remove all the `async` and `await` keywords in the code, and write `with sync_playwright() as p:` inside the `get_html` function. This will remove the need for async entirely. But it won't work with Jupyter notebook, only with a regular python file.
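For anyone making this change, here is a minimal sketch of what a sync version of `get_html` could look like. It assumes the function keeps the `url, selector, sleep, retries` signature that shows up in the tracebacks in this thread; the default values and the printed message are placeholders, not necessarily the video's exact code.

```python
import time


def get_html(url, selector, sleep=5, retries=3):
    """Return the inner HTML of `selector` on `url`, or None if every try fails."""
    # Imported inside the function so this file can still be imported on a
    # machine where playwright is not installed yet.
    from playwright.sync_api import sync_playwright, TimeoutError as PlaywrightTimeout

    html = None
    for i in range(1, retries + 1):
        time.sleep(sleep * i)  # wait a little longer before each new attempt
        try:
            with sync_playwright() as p:
                browser = p.firefox.launch()
                try:
                    page = browser.new_page()
                    page.goto(url)
                    html = page.inner_html(selector)
                finally:
                    browser.close()  # always release the browser process
        except PlaywrightTimeout:
            print(f"Timeout error on {url}")
            continue
        else:
            break  # success, stop retrying
    return html
```

Save it in a regular file and run it with `python x.py`. Calling it from a Jupyter cell triggers the "Sync API inside the asyncio loop" error quoted in this thread.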
@@Dataquestio I tried this and got:
"
Error: It looks like you are using Playwright Sync API inside the asyncio loop.
Please use the Async API instead.
"
As far as I can tell I'm not using an asyncio loop anymore since I made the changes you mentioned.
If you're running in Jupyter, you need to use async (like in the video). If you're writing a regular .py file and running from the command line, then you can use the sync api like I mentioned in the comment above.
You wouldn't get the error that you shared if you're running a regular python script (create a `x.py` file, run using `python x.py` from the command line).
Jupyter by default wraps code in an asyncio loop. So anything you run in Jupyter is already running async!
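If you would rather keep the async code in a regular script, the usual pattern is to put the `await` calls inside an `async def` and start it with `asyncio.run`. A small sketch of that pattern follows; `scrape_season` here is only a stand-in for the tutorial's coroutine, and the short `SEASONS` range is made up for illustration.

```python
import asyncio

SEASONS = list(range(2016, 2018))  # shortened range, for illustration only


async def scrape_season(season):
    # Stand-in for the real coroutine from the tutorial; it does no scraping.
    await asyncio.sleep(0)
    return season


async def main():
    # `await` is legal here because we are inside an `async def`.
    return [await scrape_season(season) for season in SEASONS]


if __name__ == "__main__":
    # asyncio.run starts an event loop for a plain script. Inside Jupyter a
    # loop is already running, so there you would write `await main()` instead.
    print(asyncio.run(main()))
```

This is why a bare `await` outside a function is a `SyntaxError` in a `.py` file but works in a notebook cell.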
@@keithravid5235 forwarding the message because he replied directly to me so you won't see it. check above/below.
Nice tutorial. I am just curious: what is the purpose of opening a browser with playwright? Why not just use the requests library to get the HTML?
a new sub is gained, thank you for this tutorial!
Nice tutorial! I’m still waiting for the data to download.
Impressive if you actually did the entire project with Jupyter Notebooks.
I had the Windows Playwright issue everyone is talking about, so I used Pycharm. Ran out of memory so I had to run from the command line.
Curious. Why did you make an opp column? You had rows of the same data without it, no?
I need help!! Can I email you with an error message I keep getting?
I'm trying to scrape with playwright but PlaywrightTimeout isn't working and I keep getting invalid syntax.
Cell In[5], line 13
except PlaywrightTimeout:
SyntaxError: invalid syntax
Hey, I keep getting an error at (next(iter(done)).result()).
Also, in the 3rd line of code, yours highlights playwright.async_api in blue. Mine doesn't. Is that an issue too? Please help.
Just noticed your responses below, I'll have to try the code as a Pycharm file and see how I get on
Thanks for the video, I follow all your work. The issue I am having is a continuous timeout error when trying to scrape the data. Any ideas to get around it?
Hi! I'm trying to get the 'line score' table, but without success: table not found.
The Selenium approach doesn't work here because it takes a lot of time, and I'm afraid my laptop will blow up. Has anyone met the same problem and solved it?
Thanks for the detailed explanation. Since I'm on Windows, I couldn't use Jupyter to run the code, so I've been trying your first option of using a Python IDE (I'm using PyCharm). I imported "from playwright.sync_api import sync_playwright" and eliminated the "async" and "await" keywords throughout the code. I was able to get all the standings pages (after a few timeouts) and was getting excited with the success! But I am having issues with the boxscore pages. The code starts with the April 2016 standings file and is able to successfully save three of the boxscore files, but it starts timing out on the fourth one and eventually throws this error: "UnicodeEncodeError: 'charmap' codec can't encode character '\u010d' in position 38876: character maps to <undefined>". When it does, the related .html file is blank. Firefox seems to work a little better than Chrome, as it doesn't time out as often. Any idea of how to get this to work?
Add `encoding="utf-8"` to the `open` call at the end of the scrape_game function:
`with open(save_path, "w+", encoding="utf-8") as f:
    f.write(html)`
Hope it helps.
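To see the fix in isolation, here is a small self-contained sketch. `save_html` is a hypothetical helper name and the sample string is made up; it only demonstrates that a character like '\u010d' (č) survives the round trip once the encoding is set explicitly.

```python
import os
import tempfile


def save_html(html, save_path):
    # Without encoding="utf-8", open() uses the system default, which on
    # Windows is typically a legacy code page (the 'charmap' codec in the
    # error) that cannot represent every character.
    with open(save_path, "w+", encoding="utf-8") as f:
        f.write(html)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.html")
    save_html("<td>Jokić, Dončić</td>", path)
    with open(path, encoding="utf-8") as f:
        print(f.read())
```

Reading the files back later should use the same `encoding="utf-8"` argument.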
@@nemanjatamindzija58 Thank you, i was having the same problem
Hi! Can you share the final code that you have, please? I'm asking because I have the same problem on Windows. @garymichalske2274
Trying on both Jupyter and PyCharm and getting the same error on the parse_data part.
When running it, it throws a ValueError with
----> 6 line_score = read_line_score(soup)
In the box_scores loop, tracing back to
1 def read_line_score(soup):
----> 2 line_score = pd.read_html(str(soup), attrs = {'id': 'line_score'})[0]
Ending message is "ValueError: No tables found"
Have checked and double checked the code, including running the version in github, but no way to get it to work.
Any ideas?
Thank you for an excellent tutorial.
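One possible cause, going by other reports in this thread that timed-out pages end up as blank .html files, is that some saved box scores simply do not contain the table. Here is a hedged sketch for spotting those files before parsing. The helper names are hypothetical, and the check is a plain string search, not a full HTML parse.

```python
import os
import tempfile


def has_line_score(path):
    """Cheap pre-check: does the saved page mention the line_score table at all?"""
    with open(path, encoding="utf-8") as f:
        html = f.read()
    return 'id="line_score"' in html


def split_usable(paths):
    """Split saved pages into ones worth parsing and ones to re-download."""
    good = [p for p in paths if has_line_score(p)]
    bad = [p for p in paths if p not in good]
    return good, bad
```

Files that land in the second list can be deleted and scraped again instead of crashing the parsing loop with "No tables found".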
got the same problem going on. Have you figured it out, bro?
@@jerryli2276 Not really. Started doing some different stuff to learn python, forgot about this project, never went back.
beautiful project, thank you.
Hi, during the parsing part, when I run the code up to `if len(games) % 100 == 0: print(f"{len(games)} / {len(box_scores)}")`, I keep getting the error "html5lib not installed", even though I have installed it myself. Could you help me with it?
Hi!!
I have one query.
Why did we take the max of each stat? What is the purpose behind it?
How is Playwright different from BeautifulSoup which also grabs HTML from website pages?
Help!.. I couldn't instantiate the browser in the "get_html" function, I already changed p.firefox.launch() to p.chromium.launch()... is it necessary to execute any previous command to install the browsers for the library "playwright"..?
I showed it in the video - you need to run `playwright install` in the command line, or `!playwright install` in jupyter notebook to install the browsers.
Hi, I keep getting a NotImplementedError when trying to do this project on Windows:
"""Create subprocess transport."""
--> 524 raise NotImplementedError
Could someone help me? How do I correct it? I'm running on Windows and VS Code.
Awesome stuff!! I am looking to parse box scores for player data. I would like to get player stats AVG and ideally get AVG for Opponent Defensive stats . Could you suggest next steps?
You want average stats for players? I want to do something similar. Well I want to get moving averages of players, so that I will predict their points scored in the next games.
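A moving average that only looks at earlier games is also one way to get inputs that already exist before the next game is played. Below is a minimal pure-Python sketch with made-up numbers; `trailing_average` is a hypothetical helper, and with pandas the same idea is usually written with `shift` and `rolling`.

```python
def trailing_average(values, window=10):
    """For each game, average the `window` games before it (None if there are none)."""
    out = []
    for i in range(len(values)):
        prev = values[max(0, i - window):i]  # earlier games only, current one excluded
        out.append(sum(prev) / len(prev) if prev else None)
    return out


if __name__ == "__main__":
    points = [10, 20, 30, 40]  # made-up points per game
    print(trailing_average(points, window=2))  # [None, 10.0, 15.0, 25.0]
```

Excluding the current game matters: if the row for a game includes that game's own stats, the model sees information it would not have at prediction time.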
@@FlisB did you ever figure out how to do this?
I can't get all the data to scrape
any suggestions??!!
Hello! I am doing a similar project but for the NFL. Firstly, is it okay to scrape data from the football reference website, their T&C's are rather unclear. Also, Unlike the basketball reference website where you have to iterate through the months to get all the games, you do not have to do this on the football reference website, therefore, I am wondering how I would have to amend that part of the code. Any help would be very much appreciated as this is for my Final Year Project (Dissertation) at university. Have a great day one and all.
Never could get Chromium to work. I looked everywhere to find a solution for a very long time. So I ended up using Firefox as well. Does anyone have a solution to the chromium issue to direct to me. I really want to figure that out. Great job with the video! Very intuitive. Wish more content was on the regular.
Hello ! Thanks for the great tutorial. I am an NBA fan and data nerd myself, and was wondering why you did not make use of the 'nba_api' to get the most up to date data of game ?
And if someone does use it, is there a way to build a ticker to predict the win probability of your favorite team(s) next game in an ongoing season ??
Thanks again for the great content !
The NBA api might only be for private use. I know the NFL api is.
@@AS-rg9ly it is not. I have tried it myself.
getting a NotImplementedError, any help?
see above comment of mine if my last comment didnt go thru, its not showing it for some reason
How do you get past the cookie wall. I can't download the proper HTML because of cookies
I tried to replicate it but it didn't work for some reason. When I call the get_html function I get a not implemented error but it doesn't say anything. Nice tutorial though
i’m having trouble installing playwright can anyone help?
This whole thing is not working. Tried like a thousand times, kept getting the same error. I can send screenshot if possible
Hello boss, is it normal to run the scrape season a few times to gather all the data if some of them timeout? Thank you for your time.
I had this issue. I upped the number of retries and the timeout, and I have all the data now.
@@nicksteele6578 thanks homie
hello, thanks for the tutorial!
NOTE FOR ANYONE GETTING NOT IMPLEMENTED ERROR::::::
you need to download wsl and run the jupyter lab using that. enter directory where you got your lab, type wsl in cmd prompt, then do jupyter lab. you may have to do something to make the jupyter lab open automatically on wsl but if you don't care you can just copy paste the address it gives into chrome and itll open.
Windows doesn't support async Playwright, hence the error.
code gives this result "NotImplementedError:
Output is truncated. View as a scrollable element or open in a text editor. Adjust cell output settings..."
I have the same problem, did you manage to solve it?
Also getting this error
need to run jupyter on wsl if youre using jupyter on windows. playwright async doesnt work with windows
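A bit of background, as far as I know: the tracebacks in this thread end in asyncio's `_make_subprocess_transport`, because Playwright starts its driver as a subprocess and the selector-based event loop that Jupyter has used on Windows cannot do that. This small diagnostic sketch (the `loop_info` name is hypothetical) shows which loop class your code is actually running under.

```python
import asyncio
import sys


async def loop_info():
    # Report the platform and the class of the event loop running this coroutine.
    loop = asyncio.get_running_loop()
    return sys.platform, type(loop).__name__


if __name__ == "__main__":
    # In a plain script; inside a Jupyter cell use `await loop_info()` instead.
    print(asyncio.run(loop_info()))
```

If a notebook on Windows reports a selector loop while a plain script reports a proactor loop, that would be consistent with the error appearing only in Jupyter.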
me too
any chance you solved
Trying to re-run just the get_data.ipynb in Jupyter Lab on my local machine. I have changed p.chromium.launch() to p.firefox.launch() in get_html() and am still getting the "Timeout error on {url}" when I run
`for season in SEASONS:
    await scrape_season(season)`
Any tips?
Update: couple of nifty tricks to get this part to work. Change `retries` in get_html to at least 5, and you will probably still run into the timeout issue which causes either BeautifulSoup or f.write(html) to error out, so what you need to do is keep running the code over and over again, and keep an eye on the standings directory. As it populates with each season's month's htmls, modify the seasons variables to exclude those years (e.g. change it from SEASONS = list(range(2016,2024)) to SEASONS = list(range(2017,2024)) and keep iterating up that lower bound as needed).
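Instead of editing `SEASONS` by hand between runs, the re-run can skip anything already on disk. This is a rough sketch, assuming each page is saved under the last segment of its URL; the `pending` helper and the example.com URLs are made up for illustration.

```python
import os
import tempfile


def pending(urls, out_dir):
    """Return the URLs whose page has not been saved to out_dir yet."""
    todo = []
    for url in urls:
        save_path = os.path.join(out_dir, url.split("/")[-1])
        # Treat an empty file as missing: a failed write can leave one behind.
        if not os.path.exists(save_path) or os.path.getsize(save_path) == 0:
            todo.append(url)
    return todo


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    with open(os.path.join(out_dir, "a.html"), "w", encoding="utf-8") as f:
        f.write("<html></html>")
    urls = ["https://example.com/a.html", "https://example.com/b.html"]
    print(pending(urls, out_dir))  # only b.html is still missing
```

Running the scrape loop over `pending(...)` each time means repeated runs only retry what actually failed.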
same here
Hello, thanks for this amazing tutorial. Did anyone else get an error while installing playwright? I got the "playwright is not recognized as an internal or external command" error message both in the command line and in Jupyter notebook. Can anybody help me please?
You would need to run `pip install playwright` in the command line, or `%pip install playwright` in Jupyter. (remove the `, that's just to show which part is the command).
@@Dataquestio I did it, but '!playwright install' failed and I don't know why (you said that we must also run this in Jupyter or the command line). This is the error message I got: 'playwright' is not recognized as an internal or external command,
operable program or batch file.
Another request: can you add the current season's results to the CSV files available in the project files? Please.
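When the `playwright` command is not found even after `pip install playwright`, one thing to try is running it as a module of the same Python interpreter, which avoids depending on PATH. A small sketch of that idea follows; the function names are hypothetical, while `python -m playwright install` itself is a standard way to invoke the installer.

```python
import subprocess
import sys


def playwright_install_cmd():
    # sys.executable is the Python that is running right now, so the command
    # does not depend on a `playwright` launcher being on PATH.
    return [sys.executable, "-m", "playwright", "install"]


def install_browsers():
    # Downloads the browsers. This needs `pip install playwright` to have
    # succeeded for this same interpreter, otherwise it raises an error.
    subprocess.run(playwright_install_cmd(), check=True)
```

Call `install_browsers()` once from a script or a notebook cell; after that the browsers are available for later runs.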
the program shows NotImplementedError after executing "html = await get_html....." and I have done every step as you have shown
It looks like there is an issue with playwright and Jupyter on certain versions of Windows/Python (see issue at github.com/scrapy-plugins/scrapy-playwright/issues/7 ).
Your options:
* Put the code into a regular `.py` file and run it as a python script (not in Jupyter notebook) (easiest)
* Install windows subsystem for linux and run jupyter notebook using wsl
* Try to upgrade your version of Python/Jupyter and see if that works
@@Dataquestio Please, I'm having the same issue and I'm using Windows.
using windows 10, VSC, i get: html = await get_html(url, "#content .filter")
SyntaxError: 'await' outside function. we cant make html a global variable or put it in the function huh? can we use something else besides playwright? 😞
If you write your code in a regular python file (no Jupyter notebook), then you can use the Playwright sync api (instead of the async api). You'll have to replace the import with `from playwright.sync_api import sync_playwright, TimeoutError as PlaywrightTimeout`.
Then you'll need to remove all the `async` and `await` keywords in the code, and write `with sync_playwright() as p:` inside the `get_html` function. This will remove the need for async entirely. But it won't work with Jupyter notebook, only with a regular python file.
@@Dataquestio I will try this when I get home later. A few questions, if you see this.
Are you using windows? and also would you recommend I just substitute with selenium or as you said run it as a python script.
Thanks for the content, love the channel
SyntaxError: 'await' outside function
same here, I scraped with requests and it works...
Please make a python crashcourse
cool
I keep getting the below after running the for season in SEASONS loop. I'm writing it in regular python script, vice Jupyter in case that's a factor.
playwright._impl._api_types.Error: NS_ERROR_UNKNOWN_HOST
did you solve it?
Task exception was never retrieved
future:
Traceback (most recent call last):
File "C:\Users\manjunatha.reddy\AppData\Local\Programs\Python\Python310\lib\site-packages\playwright\_impl\_connection.py", line 224, in run
await self._transport.connect()
File "C:\Users\manjunatha.reddy\AppData\Local\Programs\Python\Python310\lib\site-packages\playwright\_impl\_transport.py", line 133, in connect
raise exc
File "C:\Users\manjunatha.reddy\AppData\Local\Programs\Python\Python310\lib\site-packages\playwright\_impl\_transport.py", line 121, in connect
self._proc = await asyncio.create_subprocess_exec(
File "C:\Users\manjunatha.reddy\AppData\Local\Programs\Python\Python310\lib\asyncio\subprocess.py", line 218, in create_subprocess_exec
transport, protocol = await loop.subprocess_exec(
File "C:\Users\manjunatha.reddy\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 1652, in subprocess_exec
transport = await self._make_subprocess_transport(
File "C:\Users\manjunatha.reddy\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 493, in _make_subprocess_transport
raise NotImplementedError
NotImplementedError
Kindly help me with the above error.
I also got the same error. What's wrong?
Task exception was never retrieved
future:
Traceback (most recent call last):
File "C:\Users\JU HEE RYONG\anaconda3\lib\site-packages\playwright\_impl\_connection.py", line 224, in run
await self._transport.connect()
File "C:\Users\JU HEE RYONG\anaconda3\lib\site-packages\playwright\_impl\_transport.py", line 133, in connect
raise exc
File "C:\Users\JU HEE RYONG\anaconda3\lib\site-packages\playwright\_impl\_transport.py", line 121, in connect
self._proc = await asyncio.create_subprocess_exec(
File "C:\Users\JU HEE RYONG\anaconda3\lib\asyncio\subprocess.py", line 236, in create_subprocess_exec
transport, protocol = await loop.subprocess_exec(
File "C:\Users\JU HEE RYONG\anaconda3\lib\asyncio\base_events.py", line 1630, in subprocess_exec
transport = await self._make_subprocess_transport(
File "C:\Users\JU HEE RYONG\anaconda3\lib\asyncio\base_events.py", line 491, in _make_subprocess_transport
raise NotImplementedError
NotImplementedError
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
in
----> 1 html = await get_html(url, "#content .filter")
in get_html(url, selector, sleep, retries)
4 time.sleep(sleep * i)
5 try:
----> 6 async with async_playwright() as p:
7 browser = await p.firefox.launch()
8 page = await browser.new_page()
~\anaconda3\lib\site-packages\playwright\async_api\_context_manager.py in __aenter__(self)
44 if not playwright_future.done():
45 playwright_future.cancel()
---> 46 playwright = AsyncPlaywright(next(iter(done)).result())
47 playwright.stop = self.__aexit__ # type: ignore
48 return playwright
~\anaconda3\lib\site-packages\playwright\_impl\_connection.py in run(self)
222 self.playwright_future.set_result(await self._root_object.initialize())
223
--> 224 await self._transport.connect()
225 self._init_task = self._loop.create_task(init())
226 await self._transport.run()
~\anaconda3\lib\site-packages\playwright\_impl\_transport.py in connect(self)
131 except Exception as exc:
132 self.on_error_future.set_exception(exc)
--> 133 raise exc
134
135 self._output = self._proc.stdin
~\anaconda3\lib\site-packages\playwright\_impl\_transport.py in connect(self)
119 env.setdefault("PLAYWRIGHT_BROWSERS_PATH", "0")
120
--> 121 self._proc = await asyncio.create_subprocess_exec(
122 str(self._driver_executable),
123 "run-driver",
~\anaconda3\lib\asyncio\subprocess.py in create_subprocess_exec(program, stdin, stdout, stderr, loop, limit, *args, **kwds)
234 protocol_factory = lambda: SubprocessStreamProtocol(limit=limit,
235 loop=loop)
--> 236 transport, protocol = await loop.subprocess_exec(
237 protocol_factory,
238 program, *args,
~\anaconda3\lib\asyncio\base_events.py in subprocess_exec(self, protocol_factory, program, stdin, stdout, stderr, universal_newlines, shell, bufsize, encoding, errors, text, *args, **kwargs)
1628 debug_log = f'execute program {program!r}'
1629 self._log_subprocess(debug_log, stdin, stdout, stderr)
-> 1630 transport = await self._make_subprocess_transport(
1631 protocol, popen_args, False, stdin, stdout, stderr,
1632 bufsize, **kwargs)
~\anaconda3\lib\asyncio\base_events.py in _make_subprocess_transport(self, protocol, args, shell, stdin, stdout, stderr, bufsize, extra, **kwargs)
489 extra=None, **kwargs):
490 """Create subprocess transport."""
--> 491 raise NotImplementedError
492
493 def _write_to_self(self):
NotImplementedError:
@@주희룡 Same problem here
did you solve it?
@@주희룡 Yep, I got the same error. This is why programming can be so frustrating.
the same problem (