How to create kubeflow pipeline from scratch | Live Demo | Machine Learning | Ashutosh Tripathi

Ashutosh Tripathi

15K views


Published: Oct 17, 2024
End to End Jupyter Notebook Explanation for Kubeflow pipeline building and executing
Topics Covered:
1. Python functions needed to train and predict
2. Creating components from Python functions
3. Initialising the Kubeflow pipeline
4. Defining the pipeline function and putting together all the components
5. Mounting a volume for storing component output
6. Compiling the pipeline and generating YAML, which can be uploaded directly to Kubeflow to create experiments and runs from the UI
7. Creating a run from the pipeline function using code
8. How to disable the cache to see each step's output on the second and subsequent runs
Notebook link:
github.dev/Tri...
Part 2: CSV file passing between kubeflow components: • How to pass csv and da...
If you find this video helpful, don't forget to like, share, and subscribe. This is how you can support me.
Connect with me:
LinkedIn: / ashutoshtripathiai
Instagram: / ashutoshtripathi_ai
Twitter: / ashutosh_ai
Website: ashutoshtripat...
If you want to message me directly, connect with me on LinkedIn and send a DM.
#machinelearning #kubeflow #mlops

Comments • 119

@AshutoshTripathi_AI 1 year ago ⁺³
Video on Kubeflow pipeline installation on windows:
ruclips.net/video/LSvvIt2m1Jo/видео.html
@akshaykotawar5816 7 months ago ⁺¹
Thank you, sir. I have been looking for these topics for a very long time.
@AshutoshTripathi_AI 7 months ago
Welcome
@BIZSURESH 1 year ago ⁺²
EXCELLENT ..YOUR TUTORIAL IS VERY HELPFUL FOR LEARNING ABOUT MLOPS.......BRO.....👌👌🙌🙌🙌🙌🙏🙏🙏🙏🙏
@AshutoshTripathi_AI 1 year ago
Thank you Brother.
@kanakorn 6 months ago
Great job, I can run my first pipeline from this tutorial. Thanks.
@pradipkarad6837 1 year ago ⁺²
Thanks @AshutoshTripathi_AI! Your content is very exciting and full of knowledge. Could you please make a video on running the full set of Kubeflow components locally?
@praveenkuthuru7439 2 months ago
Your work is really impressive. I have been following your videos and gaining a lot of knowledge. Excellent work, keep it up!
@AshutoshTripathi_AI 2 months ago
Thank you Praveen 🙏
@mbmathematicsacademic7038 1 month ago ⁺¹
Walking into the new week with MLOps as a new skill in my skill set 😎
@AshutoshTripathi_AI 1 month ago
That's really great.
@MsRAJDIP 1 year ago ⁺¹
Your way of explaining is really good.😊
@AshutoshTripathi_AI 1 year ago ⁺¹
Thank you
@AmitYadav-ig8yt 1 year ago ⁺²
Thanks a lot, brother. One of the best videos on this concept. Could you please do the same steps on GCP?
@AshutoshTripathi_AI 1 year ago ⁺¹
I will try to create one on GCP.
@KSANTOSHKUMAR-ge5xr 1 year ago ⁺²
Excellent tutorial. Please make a video on Kubeflow installation.
@AshutoshTripathi_AI 1 year ago ⁺¹
Here is the link to the video on installing Kubeflow locally on Windows: ruclips.net/video/LSvvIt2m1Jo/видео.html
@sajadsafarveisi4512 1 year ago ⁺¹
Thanks a lot for the tutorial (this one got my engine started). One question: what if we want to create a component not from a function but from an instance of a custom resource? Assume that the instance kind is SparkApplication (with the associated operator already created under some namespace).
@Veer1516 10 months ago
If you have something in a Spark app, why not just create a Spark pipeline?
I'm genuinely asking; I want to know the scenario in which you would use both.
@tushitdave9795 1 year ago
Good one, thanks. However, can you tell me about the kfp module? I have installed Kubeflow in my base environment, but when I opened a notebook and imported kfp, it was not recognised. I tried pip install for both kfp and kubeflow in my Jupyter notebook. Please shed some light on this.
@mdowais4322 4 months ago
Hi Ashutosh, thanks for this masterpiece of a video. Can you help me understand the storage? I want to use PostgreSQL or another relational database; how can I interact with a relational database?
@nissarahmad8545 1 year ago ⁺¹
Nicely explained E2E flow
@AshutoshTripathi_AI 1 year ago
Thank you
@RAKESHKUMARSINGH-tp7mk 1 year ago ⁺¹
Great way to get introduced to Kubeflow Pipelines.
Where can I get the source code for the example you have demonstrated? Kindly let us know. I would like to try it on my Kubeflow deployment.
@AshutoshTripathi_AI 1 year ago ⁺¹
Hi, I have updated the description of the video with the notebook link. Use the link to download the kf-pipeline notebook. Let me know if you face any difficulty in downloading.
@Dr.SureshPanchal 10 days ago
Do we need Kubernetes preinstalled?
@shivaprasad1277 1 year ago ⁺¹
Hi @Ashutosh. Every time I run the pipeline in Kubeflow, I get the log message "This step output is taken from cache." Can you please help me?
@AshutoshTripathi_AI 1 year ago
You need to disable the cache while creating the pipeline.
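def some_pipeline():
    # task is a target step in a pipeline
    task = some_op()
    # "P0D" (zero days) means cached results are never reused for this step;
    # a value such as "P30D" would instead reuse results up to 30 days old.
    task.execution_options.caching_strategy.max_cache_staleness = "P0D"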
@AshutoshTripathi_AI 1 year ago
You can also refer to this document:
www.kubeflow.org/docs/components/pipelines/v1/overview/caching/
@astrovedics 1 year ago ⁺¹
Hello, I am new to this whole data science concept, so my questions may be silly. Can I set up a model registry and model tracking UI on JFrog Artifactory?
@AshutoshTripathi_AI 1 year ago
As far as I know, JFrog is a repository manager where you can store Docker images and handle CI/CD, but I am not sure whether it can be used as a model registry. As far as I know, it is not used for model registry and tracking purposes, but this needs to be double-checked.
@keerthigavenkatesh3806 1 year ago
Can you please make a video on how you manage data (for an image dataset) in a bucket and access it from the program and Kubeflow?
@RaushanKumar-ut2ke 1 year ago ⁺¹
Hi Ashutosh, you are reading the CSV file from Git. But when I try to read from a local directory, I get a "no such directory" error. I am using Xeroflow for this. Is there a different way to read from a local directory?
@AshutoshTripathi_AI 1 year ago
While the pipeline is running, your local directory is not accessible from inside the pod. Hence, just keep the CSV in some online repo and read it from there.
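The advice above can be sketched roughly as follows, using only the Python standard library and assuming the CSV is reachable over a URL from inside the pod. The function name `read_csv_rows` is hypothetical, and the notebook from the video may well use pandas instead.

```python
import csv
import io
import urllib.request


def read_csv_rows(url: str) -> list:
    """Download a CSV from `url` and return its rows as dictionaries.

    If this function were turned into a Kubeflow component, the imports
    above would need to move inside the function body, because a
    function-based component has to be self-contained.
    """
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    # DictReader uses the first line of the file as the column names.
    return list(csv.DictReader(io.StringIO(text)))
```

The URL would typically be passed to the component as a pipeline parameter (for example, the raw link to a file in a Git repository), so that nothing depends on the filesystem of the machine that submits the run.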
@Sam-nn3en 1 year ago
Hello, in terms of comparison, which did you find better to use: Kubeflow or MLflow? It seemed like Kubeflow was hanging and using extra resources. We haven't done heavy pipeline runs and I was curious to know.
@AshutoshTripathi_AI 1 year ago ⁺¹
I used Kubeflow for pipeline creation and MLflow for the model registry. Kubeflow provides a registry backed by MinIO, but MLflow seems more user-friendly and feature-rich.
@Sam-nn3en 1 year ago ⁺¹
@@AshutoshTripathi_AI Thank you for sharing. Yes, that was very clear from the other MLflow video you made. It does model serving with the registry very nicely.
@camiloperez2376 10 months ago
Thanks for sharing! Where can I download the file 'IRIS_Classifier_pipeline1.yaml'?
@adilshaikh9123 1 year ago
Sir, so far I have set up the MLflow UI, which logs all the metrics and artifacts exactly as shown in your previous MLflow video. Separately, I have written the Kubeflow pipeline code as done in this video, and my pipeline was also created successfully. But how can I integrate MLflow as a part of Kubeflow, given that the two are separate for now?
@datasciencewitharbaaz5221 1 year ago ⁺¹
Hello Sir, very nice explanation. I have one doubt: can't we use .py files rather than .ipynb files? I have an entire project with different functionalities based on the dataset.
@AshutoshTripathi_AI 1 year ago ⁺²
Yes, you can use a .py file. Even in an .ipynb file, every cell can be considered a separate .py file.
@datasciencewitharbaaz5221 1 year ago ⁺¹
@@AshutoshTripathi_AI Can we do model versioning in Kubeflow? If yes, then how, sir? Can you give an idea or any possible solution?
@mateopolancec8478 1 year ago
@@datasciencewitharbaaz5221 Use MLflow for that. Look at my previous answer on how to use MLflow with Kubeflow pipelines.
@ShailendraMishra26 1 year ago ⁺¹
Hi Ashutosh,
This video was very helpful. I am stuck on one point. Please help.
What is the process if we want to execute a task after multiple other tasks have executed? Is there any option in the .after method to add more tasks? Any help would be greatly appreciated.
@AshutoshTripathi_AI 1 year ago
Do you mean running tasks in a sequential manner?
@ShailendraMishra26 1 year ago
Yes
@ShailendraMishra26 1 year ago
@@AshutoshTripathi_AI Could you please help with the question above?
@AshutoshTripathi_AI 1 year ago
@@ShailendraMishra26 Hi Shailendra, I replied to your question above. I did not understand what exactly you mean. Do you want to run your tasks sequentially, one after another? For example, if task 2 depends on the output of task 1, then task 2 should wait for task 1 to finish. Is this what you are expecting?
@ShailendraMishra26 1 year ago ⁺¹
@@AshutoshTripathi_AI Yes, I want to run them sequentially. But my question is this: I have 3 tasks, and the third should be executed once the other two have executed. The output of both tasks is required to run the third one. I want to check whether there is any way to pass multiple output parameters in the after method.
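For the three-task case described above, one possible sketch is shown here, assuming the KFP v1 SDK (kfp<2) used in the video. Passing the outputs of the first two tasks as arguments to the third makes it wait for both, and `.after()` accepts several tasks for cases where no data is passed. The function and pipeline names are hypothetical.

```python
def make_a() -> int:
    """First upstream step (hypothetical)."""
    return 2


def make_b() -> int:
    """Second upstream step (hypothetical)."""
    return 3


def combine(a: int, b: int) -> int:
    """Third step; needs the outputs of both upstream steps."""
    return a + b


def build_pipeline():
    """Return a pipeline function for the fan-in case.

    kfp is imported lazily, so the plain functions above can be tried
    without the SDK installed. Assumes kfp<2.
    """
    from kfp import dsl
    from kfp.components import create_component_from_func

    a_op = create_component_from_func(make_a)
    b_op = create_component_from_func(make_b)
    combine_op = create_component_from_func(combine)

    @dsl.pipeline(name="fan-in-example")
    def fan_in_pipeline():
        task_a = a_op()
        task_b = b_op()
        # Consuming both outputs already makes task_c wait for task_a and task_b.
        task_c = combine_op(a=task_a.output, b=task_b.output)
        # Explicit ordering; after() takes any number of tasks.
        task_c.after(task_a, task_b)

    return fan_in_pipeline
```

When data flows between the steps, the explicit `after()` call is redundant and is kept here only to show that it takes more than one task.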
@purvijain-j1g 1 year ago
Hi, thanks for the video.
However, I am not able to execute the code because the pipeline cannot access the data file. I have tried giving an absolute path as well, but no luck. Can you help me?
@madhavilatha716 9 months ago
This code is no longer supported by the latest version, 2.4.0. Any help?
@reddyvarinaresh7924 1 year ago ⁺¹
Nice, Ashutosh!
@AshutoshTripathi_AI 1 year ago
Thank you.
@yasshhh-y1u 1 year ago
Hi Ashutosh, thanks for your session. When I started the pipeline, the t-vol step stayed in the pending state with this message: ContainerCreating
@kirancrazy393 10 months ago
I was trying to replicate your code, but I am getting this error: AttributeError: module 'kfp.components' has no attribute 'create_component_from_func'. My kfp version is 2.4.0.
How can I fix this?
@AshutoshTripathi_AI 10 months ago
In this case, please refer to the official Kubeflow documentation for version 2.4.0 to check whether the method name has changed.
@jilanikashif 1 year ago
Thanks for sharing valuable information. I had been looking for a Kubeflow tutorial for a long time. One thing that is not clear to me is how to set up the dashboard for Kubeflow.
@AshutoshTripathi_AI 1 year ago
OK. So do you mean the central dashboard for Kubeflow, where we see all the Kubeflow components such as notebook servers, volumes, experiments, contributors, etc.?
If yes, then for this you need to deploy the complete Kubeflow on a Kubernetes cluster.
It requires a lot of memory, which is why I set up only Kubeflow Pipelines locally; that covers the main work for data scientists.
@jilanikashif 1 year ago
@@AshutoshTripathi_AI How can we set it up locally? I have followed the tutorial and created the YAML file. Now I am stuck on uploading the YAML file locally and viewing the pipeline.
@jilanikashif 1 year ago
@@AshutoshTripathi_AI Please help with installing it locally and viewing the pipeline.
@AshutoshTripathi_AI 1 year ago
@@jilanikashif OK. I will create a video on installation soon, but until then you can follow the steps below to install Kubeflow Pipelines locally:
1. Install Docker Desktop.
2. Install minikube. Search Google for "minikube installation", open the official site, and follow the steps there.
3. Start minikube using the minikube start command.
4. Search Google for "kubeflow pipeline installation locally", open the Kubeflow page, and scroll down. There you will find two commands that you need to execute, followed by the port-forwarding command.
5. Once you have completed these steps, Kubeflow Pipelines will be installed locally.
@jilanikashif 1 year ago ⁺¹
@@AshutoshTripathi_AI Thanks for replying and sharing your knowledge. I have followed the steps up to minikube start and it is working; however, the Kubeflow Pipelines installation has not been working. Could you please share the page that shows the commands for the local setup and port forwarding?
@chandrashekhartiwari508 1 year ago
Hi sir, can we use both MLflow and Kubeflow in a project?
@AshutoshTripathi_AI 1 year ago
Kubeflow has its own artifact registry, which uses MinIO for storage.
However, if you want to use MLflow with Kubeflow, then you have to integrate MLflow with Kubeflow.
You can use Terraform to do this. I have not done this myself, as it mainly needs DevOps knowledge. Please refer to the Kubeflow documentation; it has some documents on this.
@geetatripathi9335 1 year ago ⁺¹
Good 👍
@AshutoshTripathi_AI 1 year ago
🙏
@placementandjobs4102 1 year ago
Sir, if a component fails in a Kubeflow pipeline, how can I skip it and start the next component? For example, I have 3 components: a, b, and c.
If b fails, I want to run c regardless of whether b fails or succeeds. How can I achieve this? At the moment, when b fails, the pipeline does not move on to the next component, c.
@AshutoshTripathi_AI 1 year ago
If components depend on others, then they have to run sequentially; otherwise they run in parallel without depending on each other.
For sequential execution, you can't skip.
@ramanjulubodisetty3665 1 year ago
Hi Ashutosh, I am getting an error while running the Kubeflow pipeline. It says there is no such file or directory path. Do you have any suggestion for reading the dataset without using an S3 bucket?
@AshutoshTripathi_AI 1 year ago ⁺¹
You can read it from a GitHub repository, a GCS bucket, etc.
@ramanjulubodisetty3665 1 year ago
Sir, I want to become an MLOps expert. Can you please suggest a good course or institute?
@sumitchauhan8245 1 year ago ⁺¹
How can I find the session cookie? Could you please share the steps to get it? Thanks
@AshutoshTripathi_AI 1 year ago
In the browser, right-click and choose the Inspect option. Then click on the Application tab; on the left side you will see the Cookies option. Expand it and you will see the URL. Click on the URL, and in the body section you will find the auth_session entry. Copy that long string. This is your browser auth session cookie ID.
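Once the value is copied, it can be handed to the SDK client roughly as sketched below. The cookie name `authservice_session` is an assumption (it depends on how authentication is set up in the deployment, so use whatever name the browser shows), and `session_cookie` and `connect` are hypothetical helper names.

```python
def session_cookie(value: str, name: str = "authservice_session") -> str:
    """Format a cookie string in the usual "name=value" form.

    The default cookie name is an assumption; check the name shown in
    the browser's cookie list for your deployment.
    """
    return f"{name}={value}"


def connect(host: str, cookie_value: str, namespace: str):
    """Create a client for a Kubeflow Pipelines endpoint behind a login page.

    kfp is imported lazily, so the helper above works without the SDK.
    """
    import kfp

    return kfp.Client(
        host=host,
        cookies=session_cookie(cookie_value),
        namespace=namespace,
    )
```

The `host` and `namespace` values are specific to the deployment; the namespace is usually the user profile created by whoever administers the cluster.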
@sumitchauhan8245 1 year ago
I did the same thing, but after running my pipeline on the server I am getting this error:
    702 # Make the request on the httplib connection object.
--> 703 httplib_response = self._make_request(
    704     conn,
    705     method,
    706     url,
    707     timeout=timeout_obj,
    708     body=body,
    709     headers=headers,
    710     chunked=chunked,
    711 )
    713 # If we're going to release the connection in ``finally:``, then
    714 # the response doesn't need to know about the connection. Otherwise
    715 # it will also try to release it and we'll have a double-release
    716 # mess.
@sumitchauhan8245 1 year ago ⁺¹
What should the namespace parameter be? The notebook name?
@AshutoshTripathi_AI 1 year ago
No, not the notebook name. By default the namespace is kubeflow. But if you are working on a server deployment, the ops team might have created multiple accounts for different users, so you need to check. You can also find the namespace parameter in the URL, if it appears there.
Conceptually, Kubeflow is multi-tenant, so user accounts are segregated by namespace.
@placementandjobs4102 1 year ago ⁺¹
Sir, how do I add a Jupyter notebook in Kubeflow?
@AshutoshTripathi_AI 1 year ago
Replied in your other comment.
@devanshumishra6430 1 year ago
How do we integrate it with KServe?
@lug__aman 1 year ago
Brother, it is not working: module 'kfp.components' has no attribute 'create_component_from_func'
@AshutoshTripathi_AI 1 year ago
Please check the version you are installing. It may be that the function was renamed or replaced by a new method in the upgraded version.
@lug__aman 1 year ago
@@AshutoshTripathi_AI I am using 2.0.1. Maybe some function names have changed, but I cannot find up-to-date documentation. Is any recent documentation available? I checked the Kubeflow docs, but they are not updated.
Also, you are on version 1.8.18, and I am not able to install that specific 1.8 version using pip.
@AshutoshTripathi_AI 1 year ago
@@lug__aman github.com/kubeflow/pipelines/issues/7794#issuecomment-1164986300
The KFP v2 docs suggest using the @component decorator. The function above is deprecated.
@AshutoshTripathi_AI 1 year ago
@@lug__aman Please refer to this URL:
www.kubeflow.org/docs/components/pipelines/v2/pipelines/pipeline-basics/
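As a rough illustration of the difference between the two SDK generations, the sketch below picks the component-creation call based on the installed kfp version. The helper names `kfp_major` and `make_component` and the default base image are hypothetical; this is one way to bridge the two APIs, not the approach used in the video, which targets kfp 1.8.

```python
def kfp_major(version: str) -> int:
    """Return the major version number from a string such as "2.4.0"."""
    return int(version.split(".")[0])


def make_component(func, base_image: str = "python:3.9"):
    """Turn a self-contained, type-annotated function into a component.

    kfp is imported lazily, so kfp_major() can be used without the SDK.
    """
    import kfp

    if kfp_major(kfp.__version__) >= 2:
        # KFP v2: create_component_from_func is gone from kfp.components;
        # the dsl.component decorator is used instead.
        from kfp import dsl

        return dsl.component(base_image=base_image)(func)

    # KFP v1 (for example 1.8.x), as used in the video.
    from kfp import components

    return components.create_component_from_func(func, base_image=base_image)
```

Pipelines written for v1 usually need further changes under v2 (for example in how caching and volumes are configured), so pinning the SDK to the version used in the notebook is often the quicker way to reproduce the demo.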
@lug__aman 1 year ago
@@AshutoshTripathi_AI I followed the document. The docs already contain some sample code for testing, with functions and pipelines defined. I copied everything from the docs as a test and compiled it, which created a YAML file. I then uploaded it in the Kubeflow UI, which is installed in a cluster.
But I am getting this error:
Cannot get MLMD object from meta store
@geetatripathi9335 4 months ago
Very good beta
@AshutoshTripathi_AI 4 months ago
Thank you
@kirancrazy393 10 months ago ⁺¹
Can I have your GitHub repo link, please?
@AshutoshTripathi_AI 10 months ago
It is in the description of the video.
@datasciencewitharbaaz5221 1 year ago ⁺¹
Why is it not creating visualizations for the metrics and confusion matrix?
@AshutoshTripathi_AI 1 year ago
In this video I have not covered the visualization part of Kubeflow Pipelines. Are you getting any error?
@datasciencewitharbaaz5221 1 year ago
@@AshutoshTripathi_AI I went through the documentation but didn't find anything. I am not getting any visualizations; it says "No Visualizations generated, create manually." But it should create them automatically, right?
@AshutoshTripathi_AI 1 year ago
@@datasciencewitharbaaz5221 I am not sure what you are doing to generate the visualization. Let me check the visualization part of Kubeflow Pipelines, and I will let you know how to generate and store them.
@satyam70 3 months ago
Do you take any classes?
@AshutoshTripathi_AI 3 months ago
I used to. I have stopped for some time due to other work.
@pankajjaiswal3907 9 months ago
This code is outdated for the current version, and there are many errors in it. Please update the code for the new version.
@AshutoshTripathi_AI 2 months ago ⁺¹
Please refer to the official documentation for updates on the newer version.
@vishalwaghmare3130 1 year ago
What is @ds1
@AshutoshTripathi_AI 1 year ago
Not sure. Have I mentioned it anywhere in the video? Let me know. Thanks
@vishalwaghmare3130 1 year ago
At 12:47
@AshutoshTripathi_AI 1 year ago ⁺¹
kfp.dsl contains the domain-specific language (DSL) that you can use to define and interact with pipelines and components.
You can read about it here:
www.kubeflow.org/docs/components/pipelines/v1/sdk/sdk-overview/
@keerthigavenkatesh3806 1 year ago
I am facing the following error. Does anyone know how to solve it?
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[5], line 1
----> 1 create_step_prepare_data = kfp.components.create_component_from_func(
2 func=prepare_data,
3 base_image='python:3.7',
4 packages_to_install=['pandas==1.2.4','numpy==1.21.0']
5 )
AttributeError: module 'kfp.components' has no attribute 'create_component_from_func'
@AshutoshTripathi_AI 1 year ago
Just check which version of Kubeflow Pipelines you are using. In older versions it was not there. Try referring to the Kubeflow documentation.
@keerthigavenkatesh3806 1 year ago ⁺¹
@@AshutoshTripathi_AI I was using the newer version, and now the error is resolved. Thanks a lot Ashutosh!
@AshutoshTripathi_AI 1 year ago
@@keerthigavenkatesh3806 Good to hear.
@pankajjaiswal3907 1 month ago
You did not make the data/ path clear.
@AshutoshTripathi_AI 1 month ago
??
@pankajjaiswal3907 1 month ago
@@AshutoshTripathi_AI The path of the data folder.
@saadnajar2858 1 year ago
First of all, thanks for the video. I have a problem when creating the kfp.Client(); it prints: Failed to load kube config. MaxRetryError: HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /apis/v1beta1/healthz (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))
@adilshaikh9123 1 year ago
Hey, did you find a solution? I'm also facing the same issue!
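The error above shows the client falling back to localhost on port 80, where nothing is listening. A small standard-library check like the sketch below can confirm whether the Pipelines API is reachable before a client is created. It assumes the Pipelines UI service has been port-forwarded to a local port (8080 is used in the usage note purely as an example), and the helper names are hypothetical; the `/apis/v1beta1/healthz` path is taken from the error message.

```python
import urllib.error
import urllib.request


def healthz_url(host: str) -> str:
    """Build the health-check URL that appears in the error message."""
    return host.rstrip("/") + "/apis/v1beta1/healthz"


def is_reachable(host: str, timeout: float = 3.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(healthz_url(host), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

If `is_reachable("http://localhost:8080")` returns True, passing the same address explicitly, as in `kfp.Client(host="http://localhost:8080")`, avoids the default host that produced the error.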
@placementandjobs4102 1 year ago ⁺¹
Sir, how do I add a Jupyter notebook in Kubeflow?
@AshutoshTripathi_AI 1 year ago ⁺¹
For that you need to install the complete Kubeflow with all components, which requires a lot of resources. Hence, what I suggest is that you install Jupyter Notebook with Anaconda, use it to build the pipeline, and then connect to Kubeflow Pipelines as shown in the tutorial.
