Tensorflow with GPU on Windows WSL using Docker
- Published: Jul 17, 2022
- In this video we show you how to run Tensorflow with GPU on Windows using WSL (WSL2) and Docker. There are several steps that should be completed in order. However, the initial challenges are worth it. This is the best way to run machine learning tasks locally on Windows.
Companion Article:
/ tensorflow-with-gpu-on...
If this video helped you out, be sure to like and subscribe for more content!
Links (in order of installation):
www.nvidia.com/Download/index...
docs.microsoft.com/en-us/wind...
docs.docker.com/get-docker/
hub.docker.com/r/tensorflow/t...
Master Tensorflow and Keras with the creator of Keras, François Chollet! Plus, you can help support KNuggies with this affiliate link 🤑:
www.manning.com/books/deep-le...
The repo containing the Dockerfile and docker-compose.yaml can be found here:
github.com/KNuggies/tensorflo...
Join this channel to get access to perks:
/ @knuggies
Follow on Social Media to keep up with the latest KNuggies posts:
/ knuggies3
Join the Discord server for announcements:
/ knuggies3
All KNuggies Medium articles can be found here:
/ knuggies
Looking for more programming resources? Check out Manning Publications (Affiliate Link): www.manning.com/?K...
The companion Medium article can be found here: medium.com/@knuggies/tensorflow-with-gpu-on-windows-with-wsl-and-docker-75fb2edd571f
Master Tensorflow and Keras with the creator of Keras, François Chollet! Plus, you can help support KNuggies with this affiliate link 🤑:
www.manning.com/books/deep-learning-with-python-second-edition?KNuggies&a_aid=KNuggies&a_bid=6e43a0f9
Amazing tutorial. Almost 1 year later, this works seamlessly.
I've been trying for over 8 hours today to get Tensorflow GPU working on Windows 11 to no avail..
Watching this, got it set up in 15 minutes.
Can't thank you enough, wish I would have found this sooner... Cheers!!
I used this tutorial today, Jan 2023. It still works and it is the ONLY tutorial about installation of TF on Windows with NVIDIA on YT that works! Good job! I also just got Google Coral Dev board to play with.
Best tutorial on YouTube for using Docker with GPU support in Windows through WSL. It worked just fine in January 2023.
It is for sure the best video regarding docker usage on Windows and VS code. Congrats!
Really good tutorial. Helped me figure out how to get the GPU running!
Really good tutorial, all you need to start in less than seventeen minutes! Thanks!
Your tutorial was most valuable, as is the included repository. Thank you!
Great tutorial! Please keep teaching us all the good stuff ;)
I love running into these tutorials that have just the right amount of rigor to get you started.
Excellent - that told me exactly what I need to know - well done!
Fantastic tutorial! Everything was to the point and easy to follow.
This is very helpful. Thank you!
Amazing tutorial! Thanks
I had to struggle a bit for some unexpected error messages but once it was fixed, it worked very well ! Thanks :)
Were your error messages like this:
jupyter-lab-1 | /bin/sh: 1: [: jupyter,: unexpected operator
jupyter-lab-1 exited with code 2
Thanks for the tutorial!! :D
Best video on YouTube, works in August 2023
Hey everyone, I could get rid of the errors after installing CUDA on the WSL2 instance. Thank you for the tutorial!
You are life saver!
Absolutely amazing tutorial.
I got so fed up with trying to make the latest version of tensorflow work with gpu that I almost just installed an old version that worked for me before.
But this is SO MUCH BETTER!
so easy to set up, works flawlessly, 10/10.
I only need to figure out my workflow with git and vscode to make it as easy as possible but hopefully that is the simple part.
Thank you so much!
Thank you for the kind words! It probably requires a token login instead of username and password if you try to connect to your github account from wsl. Otherwise, using git with vscode in WSL isn't too bad.
one thing to note about the prerequisites is that windows has to be in version 21H2, if you get version 21H1, you can't access GPU inside WSL. took me 2 days to figure out :v
I feel your pain. I had the same issue. I installed 22H and voilà!
Perfecto!🤗
it's working!!!! finally. thank youuuuuuuuuuuuuuu
thanks
Tomorrow, I will try this
Awesome! Let me (and others) know if it still works.
@KNuggies YES!! It's working :)
My configuration:
Windows 10
WSL 2 running Ubuntu-22.04
Using the docker image tensorflow/tensorflow:2.13.0-gpu (pushed to dockerHub on July 6 2023)
Nvidia Quadro M1200
Thank you very very much. I spent the whole day yesterday following the official steps in tensorflow pages, and it didn't work.
Just made my first Medium post on my personal page. It accompanies this tutorial well: medium.com/@jason_barhorst/tensorflow-with-gpu-on-windows-with-wsl-and-docker-75fb2edd571f
I had to change the command to "docker run -it --rm -p 8888:8888 --gpus all tensorflow/tensorflow:latest-gpu-jupyter", adding "/tensorflow" after the first "tensorflow". (At the link, by the way; the video is correct.) Cheers.
Thanks! I just fixed it.
Ty bro
and thank you for watching !
Thanks
Wow! Thanks for the Super!
Just fyi, you could use CuPy instead of Numpy if you want to make use of your GPU for normal Numpy related things. It is more or less swappable for most Numpy commands, so it’s easy to modify code between the two.
Interesting. I've heard of Dask for GPU accelerated Pandas, but didn't know about this one.
Great tutorial, thank you so much! One question: is there a way I can edit my notebook from Visual Studio Code or do I always need to use the explorer?
VS Code allows you to open notebooks just like any code file (maybe it will prompt for an addon). When you have the notebook running in VS code, you need to choose your notebook server. Just copy and paste the URL with token that is displayed when starting the jupyter lab / notebook server. If you don't choose your server and try to run everything with the default VS Code environment, you'll almost certainly run into problems or unexpected behavior.
Using VS Code with notebooks does make type hints and navigation better for VS Code users, but it's still missing something. Every time I try to switch to VS Code for notebooks, I find myself going back to the web interface. Try it out and see if you like it!
If you installed a WSL distribution such as Ubuntu 20.04, you can face unexpected errors when you try to open a "New WSL Window". I solved this problem by choosing the option "New WSL Window using Distro" and then selecting my installed system from the dropdown.
Thanks for sharing in case others have this issue!
Great video, thank you!
One question, would there be any performance difference if you install the CUDA drivers, tensor flow libraries and run the jupyter lab directly from the WSL command line, instead of running everything inside a docker container that runs over the WSL?
Great question! Wish I knew the architecture of each well enough to answer it. Might have to test that to get some time comparisons. My WSL is about due for a fresh start anyways.
docker run -it --rm -p 8888:8888 --gpus all tensorflow/tensorflow:latest-gpu-jupyter
Brother tnx ❤️🫵
Thank you and it is a good tutorial. Will it need to install cudnn?
No need to install cudnn! Finding the right version of that to match the various ML packages was always the worst.
Hi, it was a really nice tutorial! I could follow it until the end and use Tensorflow on Windows with WSL and Docker. I am still new to Docker, so I have two questions. 1. After we have built the Docker container, how do we change the container if we want to install a new Python library? 2. Do we need to create a new working directory, Dockerfile, requirements.txt and docker-compose.yaml when building a new container?
After modifying the Dockerfile and/or requirements.txt, just run "docker build -t container-name ." to rebuild the container.
I have had to clear the cache and delete the old container before seeing changes sometimes. Commands for that are just a google away. I don't think docker was detecting my changes to requirements.txt for that one, but not sure.
After building the container again, run "docker compose up" to restart the server with the new container image.
@KNuggies thank you!! that's very clear to me!
Great video. Can I code in the container, using this tensorflow installation, in VS Code directly? When I try coding this untitled file (left panel of vs code) it says TF not installed but the jupyter lab environment works fine.
You can connect to the container and open a directory to program in with the VS Code Docker extension. It's the easiest way to do this. Just make sure the directory you code in is a volume shared with your OS if you want to save the code.
Hi, thank you for your tutorial, is there any way to use the TensorFlow container inside vscode (remote with WSL2) with python?
You want the Docker Extension in VS Code. Just right click on a running container from the Docker Extension to attach VS Code to the container.
Hi, I have set up the docker file and able to run the tensorflow in jupyterlab using the command "docker-compose up".
I wonder do I need to run the tensorflow by running the command everytime? Or there is a shortcut way (e.g., create a docker container or image?). Hopefully you can guide me through this.
Including the following two lines under your image in docker-compose.yaml should restart the container when you boot your system and the environment variable will define the token to access jupyter lab. That way you can set a bookmark if you want to have the same token for the notebook server every time you start it. Let me know how it goes.
restart: always
environment:
  JUPYTER_TOKEN: "007e1fea040680a3ed5f46b11d61170e2f8132a3a3c45fde"
Absolutely great tutorial, thank you!
I have a problem where I want to train on a large dataset saved on the Windows side. Is there a way that I can load the data from the Windows side to train on the Jupyter server set up this way? (I have tried to modify COPY to copy the whole dataset along, but that is very slow and annoying.)
You should be able to access your Windows drive in wsl under /mnt. For example you can get to the C drive using 'cd /mnt/c/' from a WSL terminal. In a notebook, you could use '/mnt/c/...' for the directory where your data is.
Hmm, thinking about it more, accessing /mnt/c from inside the container shouldn't work. You can use a volume in the docker-compose.yaml to link your directory to a directory inside the container. Something like:
services:
  notebook-service-name:
    image: ...
    volumes:
      - /mnt/c/data_directory:/tf-knugs/data
You can use whatever directories you need
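For anyone unsure how a Windows path maps onto the /mnt layout described above, here is a small Python sketch. The helper names are made up for illustration, and it assumes the default WSL behavior of mounting each drive letter under /mnt.

```python
from pathlib import PureWindowsPath


def windows_to_wsl_path(windows_path: str) -> str:
    """Map a Windows drive-letter path to WSL's /mnt/<drive>/... form."""
    path = PureWindowsPath(windows_path)
    drive = path.drive.rstrip(":").lower()
    if len(drive) != 1 or not drive.isalpha():
        raise ValueError(f"expected a drive-letter path, got {windows_path!r}")
    # path.parts[0] is the drive anchor; the remaining parts are folder names.
    return "/".join(["/mnt", drive, *path.parts[1:]])


def volume_entry(windows_path: str, container_dir: str) -> str:
    """Build a 'host:container' string for the volumes list in docker-compose.yaml."""
    return f"{windows_to_wsl_path(windows_path)}:{container_dir}"


print(volume_entry(r"C:\data_directory", "/tf-knugs/data"))
# prints /mnt/c/data_directory:/tf-knugs/data
```

The printed string is the same entry used in the volumes example above, so it can be pasted straight into the compose file.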
Thanks again! It worked perfectly :)
Hello. I followed your tutorial and it worked very well at first, but once I turned my computer off and on again it stopped detecting my GPU, even though it worked the first time. Idk what happened :(
It says that it could not find drivers for CUDA and the GPU will not be used.
as of now when i install jupyterlab from the command in Dockerfile, the version between dependencies are incompatible, for example between nbclassic and notebook, between nbformat and jupyter-server :( can you check that out?
It could have been a conflict when trying to install both jupyter-notebooks and jupyter-lab on the same image. If you were still using the tensorflow jupyter image, you should start with just the tensorflow image that doesn't include jupyter notebooks. Something like "FROM tensorflow/tensorflow:latest-gpu".
This could also be related to the recent release of Python 3.11 and many dependencies getting ready for it. The easiest way to work around this would be to pin the base image and packages to specific versions that are compatible. For now, I have the versions unpinned so it always grabs the latest, but that does come with some risk. If you are planning to deploy code, always pin the versions so you can intentionally upgrade/test when you want to.
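As a rough aid for the pinning advice above, this Python sketch lists requirements.txt entries that are not pinned with "==". The function name and the sample package versions are illustrative only, and it treats anything other than an exact "==" pin as unpinned.

```python
def unpinned_requirements(requirements_text: str) -> list[str]:
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Illustrative file contents; the versions here are placeholders.
sample = """\
jupyterlab
pandas==2.0.3  # pinned
# a comment line
scikit-learn>=1.2
"""
print(unpinned_requirements(sample))
# prints ['jupyterlab', 'scikit-learn>=1.2']
```

The same idea applies to the base image: a tag such as "latest-gpu" floats, while a specific version tag stays fixed until you change it.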
Hi, thanks so much for the tutorial!!
However, I encountered an error when trying to load E: directory in my code in the JupyterLab.
Looks like it is unable to locate the directory and files.
I didn't encounter the same problem if I just use VS code python.
Do you know how to solve it? Thanks!
Using Windows directories in Docker and/or WSL can be tricky. From a WSL terminal, try "cd /mnt" and your Windows drives should be visible. If your drive is there, you'll end up using a path like /mnt/e/... (for the E: drive) to specify where your files are.
@KNuggies can you make a tutorial for PyTorch? I have an issue when using tensorflow with librosa, so as an alternative I want to use torchaudio. Thanks!
I have an AMD Ryzen processor and no separate NVIDIA GPU, since it is not compatible. Can I still use Docker and WSL?
Without a GPU, you can still run tensorflow, docker, WSL, etc. It will just take forever to train models. Anything outside the most simple models will basically never finish.
If I want to use Jupyter notebook instead of Lab, what changes should I make?
Check for a Jupyter Notebook base image on Docker Hub. There should be some official options. Or just build up exactly what you want from a Python base image and pip install everything you need in the Dockerfile or using requirements.txt.
Does anyone know why the WORKDIR variable might fail to work? I set it to '/tf-project' and yet Jupyter opens in '/tf'.
Afraid I haven't run into this error. Did you have any luck resolving it?
Will it work with a Ryzen 5900X if I activate virtualization in the BIOS?
I don't think so. Unfortunately all the CUDA stuff is NVIDIA. Maybe someone has a work around, I haven't had an AMD card for ~5 years.
@KNuggies No, no, I have an NVIDIA GPU (1080 Ti) and a Ryzen 5900X CPU. What I mean is, do I need to activate the CPU's virtualization feature for Docker to work?
@ZeroCool22 That makes much more sense (I should have googled that). I kinda stopped watching the CPUs after the 3900 series because that was the last time I was in the market. I'm on a 3950X with virtualization enabled in the BIOS and it works fine.
When I run docker-compose up at the 14th minute, I receive "no configuration file provided: not found". How can I solve this problem?
Please make sure you are running the command from the same directory containing your docker-compose.yaml file. Hope this resolves it.
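A quick way to confirm the advice above is to check the current directory for a compose file before running the command. This Python sketch uses a hypothetical helper name; the filename list reflects the names Docker Compose looks for by default, as far as I know.

```python
from pathlib import Path

# Default filenames Docker Compose searches for in the working directory.
COMPOSE_FILENAMES = (
    "compose.yaml",
    "compose.yml",
    "docker-compose.yaml",
    "docker-compose.yml",
)


def find_compose_file(directory="."):
    """Return the first compose file found in `directory`, or None if there is none."""
    for name in COMPOSE_FILENAMES:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


found = find_compose_file()
print(found if found else "No compose file here; cd to the project directory first.")
```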
How can I install another Python library (like scikit-learn) after the first docker-compose up run?
I typically "docker-compose down" to delete the containers built. Then modify your requirements.txt (or add additional pip installs to the Dockerfile). Then rerun docker build with the --no-cache option before your next "docker-compose up".
You could also pip install from inside the container or from the notebook using "!" to start the command, but the changes will probably not persist with the container.
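To see whether a library actually made it into the image after a rebuild, a notebook cell along these lines can help. It is only a sketch: the function name is invented, and it checks top-level import names, which can differ from pip package names (scikit-learn is imported as sklearn).

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be imported in this environment."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Use import names here, not pip names.
print(missing_modules(["json", "sklearn"]))
```

An empty list means everything is importable; any names printed are candidates to add to requirements.txt before the next build.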
@KNuggies Thank you very much! Amazing video!!
can this be used for AMD gpus as well?
This tutorial won't work for AMD GPUs. Unfortunately everything CUDA is NVIDIA. I haven't had an AMD card for ~5 years and don't really have a way to try some of the options out there, but checkout tensorflow-rocm. It might help you find a way to use AMD cards.
Thanks for your great tutorial, but I have a problem. When I try to build, after downloading a couple of MB it restarts at 0 MB, and after some minutes it gives 2 errors:
1_Failed to copy: local error:tls bad record MAC
2_ service Jupyter-lab failed to build : Build failed
Also, I should mention that I set my YAML version to "2.2", because if I set it to 1 it gives a version error.
Thanks for your help
It's worth noting that version in docker-compose is obsolete, so you should be able to just remove it.
I have seen many errors during the downloading phase of operations. I just rerun the command and it works eventually. It could point to network issues or just be that they don't allow retrying if there is an error... not really sure what causes it for me.
Hopefully this helps resolve the issue for you.
Nice work! Unfortunately, can't get the jupyter lab file to save. Any suggestions?
Sounds like you have Jupyter Lab running in a container from WSL, then when you try to save the .ipynb file, it does not show up in your WSL file system outside the container?
If so, then it's almost certainly an issue with the "volumes" section of the docker-compose.yaml. If that is working, the other issue might be that you are not looking for it in the proper directory of WSL.
If you want to access files directly on Windows, that's another story. Then you need to find and mount the proper path in WSL. You can access your Windows directories in the /mnt directory in WSL. From there you can access your C drive files, etc. Such as: /mnt/c/Users/username/
That last bit is also the way to mount a windows volume containing training data if you are trying to train a model in a Docker container from WSL.
Hope this helps. If so, be sure to hit that like button :)
@KNuggies Correct, I'm trying to access the files outside the container. I found the files are being saved under "docker-desktop-data", where the complete path points to a folder called "overlay2". Under a gibberish-named folder there, I see the typical structure of root, directory (tf-knugs), mount directory. So the files are there, but unlike in the video, I can't see them in my VS Code folder. Any thoughts?
Sounds like it is creating a Docker Volume instead of linking to your specified directory. I suspect the problem is on the left side of the ":" in your volume definition in the docker-compose.yaml file. Maybe try a full path instead of the ./tf-knugs on the left side. Something like:
volumes:
  - /home/username/tf-knuggies/tf-knugs:/tf-knugs
Instead of:
volumes:
  - ./tf-knugs:/tf-knugs
Hopefully that sorts it out.
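If typing the full path by hand is error-prone, the left side of the volume entry can be generated. This Python sketch uses a made-up helper name and assumes it is run from the project directory inside WSL.

```python
from pathlib import Path


def absolute_volume(host_dir, container_dir):
    """Build a 'host:container' volume string with the host side as an absolute path."""
    # resolve() turns ./tf-knugs into a full path based on the current directory.
    host_path = Path(host_dir).expanduser().resolve()
    return f"{host_path}:{container_dir}"


print(absolute_volume("./tf-knugs", "/tf-knugs"))
```

The printed line can replace the "./tf-knugs:/tf-knugs" entry under volumes.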
On a related note, if you want to remove any or all volumes Docker created, this page shows how.
docs.docker.com/storage/volumes/#remove-volumes
Lots of other info about docker volumes there as well.
I am having trouble getting this to work. I am on Windows 11 and have RTX2070.
Inside container, nvidia-smi shows my card just fine, but tf.config.list_physical_devices() gives me an error:
E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:266] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
Haven't run into this one myself, but it seems others have when there is a mismatch between the TF version and CUDA version. This shouldn't be an issue with the prebuilt containers.
Others were able to resolve it by specifying the device at the beginning of the notebook:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = "0"
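A slightly fuller version of the snippet above, as a sketch: it sets the variable before TensorFlow is imported and then reports which GPUs TensorFlow sees. The function name is hypothetical, and the import is guarded so the cell also runs where TensorFlow is not installed.

```python
import importlib.util
import os

# Set this before TensorFlow is imported so it applies when CUDA is initialized.
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0")


def visible_gpu_names():
    """Return the GPU device names TensorFlow can see, or None if TensorFlow is absent."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf

    return [device.name for device in tf.config.list_physical_devices("GPU")]


print(visible_gpu_names())
```

An empty list means TensorFlow loaded but found no usable GPU, which matches the symptom described in this thread.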
After following your tutorial in your video and in your Medium post, when I ran "docker-compose up" it didn't work for me.
I got the following error: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "jupyter": executable file not found in $PATH: unknown
I'm grateful for any assistance on ensuring the Docker container will start.
I've only seen it fail to find jupyter on the path when trying to run the container as a user instead of root. If this is the case, the solution is a little long for this chat. Basically, the user doesn't have access to where root installed jupyter or if they have access, it's not on the user's path. There are a couple ways around this, but it's much easier to just stay root inside the container if running locally.
I am running into a strange issue at 14:00; `docker-compose up` . The container itself appears to build with no issues, but when attaching it to project-jupyter-lab-1, after a few seconds it exits saying: `project-jupyter-lab-1 exited with code 0`.
Sounds like it exited "normally" with code 0. I've seen this when I don't have the entry point properly defined (i.e. it doesn't launch the server and just exits). Please check that your Dockerfile contains 'ENTRYPOINT ["jupyter", "lab","--ip=0.0.0.0","--allow-root","--no-browser"]' as the last entry. Otherwise, I'm not sure.
@KNuggies Yep, I have that line exactly; I've also tried using the exact code from your GitHub, it still gives me the same error. Thanks for the help anyways, I'll update if I manage to get it working.
@a3r797 The only other thing I'd try is opening a terminal in the container and trying to launch jupyter lab, check installations, or just look around to see if you can spot a problem. You can get there by using docker run with the "-it" options or just right click the container from VS Code's Docker Extension.
Hey I'm just getting things set up and I'm having this same issue. Did you ever find a solution?
@youSTINKER nope, unfortunately I never could figure it out, sorry :/
Instead of Jupyter Notebook, how can you get this to work with VS Code?
You can open an existing notebook (ipynb file) in VS Code. The first time you do, it should ask how you want to run the code in the notebook. If you have the notebook server running, you can choose that as the interpreter for your notebook. You will probably need to provide the full link to the notebook server including the auth token.
VS Code has much better auto complete and hinting than Jupyter Lab for notebooks, but it's not quite the same. I tend to try VS Code once in a while for notebooks then switch back to jupyter lab when I want notebooks.
If you don't want to use notebooks at all and are trying to just use .py files, you can attach to the docker container with VS Code using the Docker addon. Just install the Docker addon in VS code. The Docker addon will display all containers. Just right click on your Tensorflow container and select Attach VS Code.
@KNuggies Appreciate it, thanks! Will try it.
Nice tutorial on using Docker, I'm completely new to it and even I can understand.
However, when running docker-compose up, it created the container successfully, but when it tried to attach it gave an error. It said: /bin/sh: 1: [: jupyter,: unexpected operator. I have tried various solutions, such as asking github copilot chat, bing chat, and reading other comments on this video. I have deleted the container and tried running the docker-compose but everything still gives me this same error. I have even checked your Medium article, and copypasted the files from there (changing the folder paths, of course) but nothing seems to work.
I hope you know a solution possible or any relevant documents/forums which can aid me in getting rid of this problem.
Thanks for the amazing tutorial,
Soumya
Hard to say since I can't recreate the problem, but it sounds like something from the ENTRYPOINT in Dockerfile is not working. I'd recommend trying the base image with notebooks instead of the custom image (I only made the custom image to use Jupyter Lab instead of the older Notebooks). The change in docker-compose.yaml would look something like:
services:
  jupyter-lab:
    image: tensorflow/tensorflow:latest-gpu-jupyter
    # ...the rest is the same as tutorial
If that works, you'd still need to install other dependencies that you list in requirements.txt, but at least the container would be running. Then you can take the next steps as needed.
This specifies an image instead of using build. Just replace the "build: ." line with the "image: ..." line.
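One possible cause of the "/bin/sh: 1: [: jupyter,: unexpected operator" message, offered as an assumption since nobody in this thread confirmed it: Docker parses the bracketed ENTRYPOINT form as a JSON array, and if that parse fails (single quotes, curly quotes pasted from a web page, a stray comma) it falls back to running the text through /bin/sh. The Python sketch below, with a hypothetical function name, checks whether the text after ENTRYPOINT is a valid JSON array of strings.

```python
import json


def parses_as_exec_form(arguments: str) -> bool:
    """Return True if `arguments` is a JSON array of strings, as exec form requires."""
    try:
        parsed = json.loads(arguments)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, list) and all(isinstance(item, str) for item in parsed)


good = '["jupyter", "lab","--ip=0.0.0.0","--allow-root","--no-browser"]'
single_quoted = "['jupyter', 'lab', '--ip=0.0.0.0']"
print(parses_as_exec_form(good), parses_as_exec_form(single_quoted))
# prints True False
```

Pasting your own ENTRYPOINT arguments into the check is a cheap way to rule this cause in or out before changing images.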
@KNuggies Thanks a lot! This solution worked for me and got the jupyter lab up and running.
@KNuggies, another question, will I need to copy paste the url into VSCode every single time? I see that the link has changed than what I had yesterday.
@python_9160 This is a good one to fix. Add the following below the "image: ..." line and it will set the same token every time:
environment:
  JUPYTER_TOKEN: "007e1fea040680a3ed5f46b11d61170e2f8132a3a3c45fde"
You can choose a different token of course. Then every time you start it, the token will be the same. The link displayed in the terminal won't include the token anymore, but you can add it yourself resulting in a link like this:
127.0.0.1:8888/?token=007e1fea040680a3ed5f46b11d61170e2f8132a3a3c45fde
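To pick your own token instead of reusing the one shown above, something like this Python sketch works. The helper names are invented; the 24-byte length simply matches the 48 hex characters of the example token, and the "http://" prefix and port 8888 are assumptions based on the port mapping used in the tutorial.

```python
import secrets


def new_jupyter_token(num_bytes=24):
    """Generate a random hex token; 24 bytes gives 48 hex characters."""
    return secrets.token_hex(num_bytes)


def notebook_url(token, host="127.0.0.1", port=8888):
    """Build the bookmarkable notebook server link for a fixed token."""
    return f"http://{host}:{port}/?token={token}"


token = new_jupyter_token()
print(f'JUPYTER_TOKEN: "{token}"')  # paste under environment: in docker-compose.yaml
print(notebook_url(token))          # bookmark this, or give it to VS Code
```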
If you are stuck on container run (with the gpu flag), check your Docker version. 4.17.1 has a problem with the gpu flag. Install 4.17.0. Hope it will help someone (because I wasted a lot of time on it)
Thanks for sharing. Another person just ran into the issue and I verified it on my machine as well. For others that run into the issue, it can be tracked on stackoverflow here: stackoverflow.com/questions/75809278/running-docker-desktop-containers-with-gpus-tag-hangs-without-any-response-in
It looks like the issue has been resolved with Docker Desktop 4.18.0.
Just to inform that Docker desktop 4.17.1 has a bug that makes this tutorial freeze when starting the container. Solution is to downgrade or keep with version 4.17.0
Thanks for sharing. My system just updated to 4.17.1 and it looks like it's struggling as well.
It looks like the issue has already been resolved with Docker Desktop 4.18.0.
It is not working; it gives a token error.
Afraid I haven't seen this error. Any luck getting it working?
After the newest Win 11 update, the container fails to start with this message:
seb@Dracula:~/project$ docker-compose up
[+] Running 1/0
✔ Container project-jupyter-lab-1 Created 0.0s
Attaching to project-jupyter-lab-1
Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.8, please update your driver to a newer version, or use an earlier cuda container: unknown
Unfortunately, my Windows 11 PC doesn't have a GPU so I can't recreate this one. One fix might be to use a specific / previous version of the tensorflow container. instead of "tensorflow/tensorflow:latest-gpu" try something like "tensorflow/tensorflow:2.11.1-gpu"
The available options can be found here: hub.docker.com/r/tensorflow/tensorflow/tags
I've checked. That did the trick. It all works again with this little, yet so important, change (tensorflow/tensorflow:2.11.1-gpu). On a slightly different issue, how can I suppress the annoying TF warnings popping up in Jupyter? Thanks again 👍
@Sebastian-hv7jz That one should be easy. I did it in the Hello World video: ruclips.net/video/F2DR4FGy0LY/видео.html
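For readers who can't open the video, one common approach (not necessarily the one shown there) is the TF_CPP_MIN_LOG_LEVEL environment variable, set before TensorFlow is imported. The wrapper function in this Python sketch is just for illustration.

```python
import os


def quiet_tensorflow_logs(level: int = 2) -> None:
    """Filter TensorFlow's C++ log output: 0 = all, 1 = hide INFO, 2 = also hide WARNING, 3 = also hide ERROR."""
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = str(level)


quiet_tensorflow_logs()
# import tensorflow as tf  # import only after the variable has been set
```

This affects the C++ backend messages printed at startup; warnings raised from Python code are handled separately.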
@KNuggies OK, thanks. Cool voice by the way, great for teaching!
Great video!
I've encountered an issue after running docker-compose up:
validating /home/waste-sorting-d/projects/docker-compose.yaml: services.jupyter-lab Additional property depoly is not allowed
Afraid I haven't seen that error before. After a little looking, it seems people have received this error by not having "services:" in the docker-compose file. Might be another issue in the yaml file. I'd recommend checking it carefully.
@KNuggies I've checked the code again, and guess what :D, I had a typo in "deploy". Found it "depoly".
Thanks for the video and for helping me out. Appreciated.
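Typos like this are easy to miss by eye. As a sketch, Python's difflib can suggest the closest known key for a rejected one; the key list below is a partial, hand-picked set of Compose service keys, and the function name is made up.

```python
import difflib

# Partial list of valid service-level keys in a compose file (not exhaustive).
KNOWN_SERVICE_KEYS = (
    "build",
    "image",
    "ports",
    "volumes",
    "environment",
    "restart",
    "deploy",
)


def suggest_service_key(key):
    """Return the closest known key for a misspelled one, or None if nothing is close."""
    if key in KNOWN_SERVICE_KEYS:
        return None  # already valid, nothing to suggest
    matches = difflib.get_close_matches(key, KNOWN_SERVICE_KEYS, n=1)
    return matches[0] if matches else None


print(suggest_service_key("depoly"))
# prints deploy
```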
@KNuggies One more question, I still don't have a clear understanding why would I have WSL for this. I mean, I can just get the Docker image and use it in Windows, right? Why do I need Linux OS for this?
Can you please help me understanding this question?
You probably don't really need to launch your container from WSL, but having WSL backend for Docker is important. I mostly used WSL so the commands and directory system match Linux.