Cloud 4 Data Science
  • Videos: 13
  • Views: 100,236
Introduction to Dataform in Google Cloud Platform
This tutorial gives an overview of the Dataform service in Google Cloud Platform.
Link to the GitHub repo with workflow: github.com/rafaello9472/dataform-demo/branches/stale
Dataform documentation: cloud.google.com/dataform/docs/overview
00:00 Introduction
00:42 What is Dataform?
01:10 Key Features and Benefits
02:30 Why Dataform?
02:43 Key Concepts
04:19 From SQL to SQLX
04:57 Version Control and Collaboration
06:03 Dependency Management
07:30 JavaScript in Dataform
09:50 Workflow Execution Scheduling Options
10:35 Creating Dataform Repository
11:41 Creating Necessary GitHub Assets
13:41 Adding Access Token to Secret Manager
14:18 Adding IAM Roles to Service Account
15:46 Creating Development Works...
Views: 31,886

Videos

Automate Python script execution on GCP
28K views · 1 year ago
This tutorial shows how to automate Python script execution on GCP with Cloud Functions, Pub/Sub and Cloud Scheduler. Link to the GitHub repo with code from this tutorial: github.com/rafaello9472/c4ds/tree/main/Automate Python script execution on GCP 00:00 Introduction 00:21 Architecture overview 01:09 GUI - Pub/Sub 01:28 GUI - Cloud Functions 02:39 Python code walkthrough 05:18 GUI - Cloud Sch...
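The Cloud Scheduler → Pub/Sub → Cloud Functions flow described above can be sketched roughly as follows. This is an illustrative placeholder, not the repo's actual code; the entry-point name `hello_pubsub` is assumed from the comments on this video.

```python
# Minimal sketch of a Pub/Sub-triggered Cloud Function (1st-gen background
# function signature). Cloud Scheduler publishes to a topic on a cron
# schedule, and this function is set as the topic's trigger target.
import base64


def hello_pubsub(event, context):
    """Entry point: `event` carries the Pub/Sub message, base64-encoded."""
    message = ""
    if "data" in event:
        message = base64.b64decode(event["data"]).decode("utf-8")
    print(f"Triggered with message: {message}")
    # ...the automated script's logic would run here...
    return message
```

Deploying with `--trigger-topic` on the same topic Cloud Scheduler publishes to completes the automation loop.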
Create Text Dataset in Vertex AI
3.7K views · 1 year ago
This tutorial shows how to create a Text Dataset in Vertex AI for single-label classification and sentiment analysis tasks. Link to the GitHub repo with code from this tutorial: github.com/rafaello9472/c4ds/tree/main/Create text dataset in Vertex AI Kaggle Ecommerce Text Classification Dataset: www.kaggle.com/datasets/saurabhshahane/ecommerce-text-classification Kaggle Twitter and Reddit Sentim...
Predict with batch prediction in Vertex AI - Image Classification
2.7K views · 2 years ago
This tutorial shows how to make predictions on an image classification dataset with batch prediction in Vertex AI. Link to the GitHub repo with code from this tutorial: github.com/rafaello9472/c4ds/tree/main/Predict with batch prediction in Vertex AI - Image Classification Create model used in this tutorial: ruclips.net/video/dl-UNtgLC1s/видео.html&ab_channel=Cloud4DataScience Input data requireme...
Train AutoML Image Classification model in Vertex AI
1.2K views · 2 years ago
This tutorial shows how to train an AutoML image classification model in Vertex AI with the Python SDK. Link to the GitHub repo with code from this tutorial: github.com/rafaello9472/c4ds/tree/main/Train AutoML model in Vertex AI - Image classification Create Image Dataset used in this tutorial: ruclips.net/video/39PxXRvo7qw/видео.html&ab_channel=Cloud4DataScience 00:00 Introduction 00:16 Dataset us...
Create Image Dataset in Vertex AI
2.6K views · 2 years ago
This tutorial shows how to create an Image Dataset for a single-label classification task in Vertex AI. Link to the GitHub repo with code from this tutorial: github.com/rafaello9472/c4ds/tree/main/Create image dataset in Vertex AI Kaggle Lemon Quality Dataset: www.kaggle.com/datasets/yusufemir/lemon-quality-dataset Prepare image training data for classification: cloud.google.com/vertex-ai/docs/...
Run custom training job with custom container in Vertex AI
6K views · 2 years ago
This tutorial shows how to run a custom training job with a custom container in Vertex AI. Link to the GitHub repo with code from this tutorial: github.com/rafaello9472/c4ds/tree/main/Run custom training job with custom container in Vertex AI Kaggle Stroke Prediction Dataset: www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset Environment variables for special Cloud Storage directorie...
Run custom training job with pre-built container in Vertex AI
3.9K views · 2 years ago
This tutorial shows how to run a custom training job with a pre-built container in Vertex AI. Link to the GitHub repo with code from this tutorial: github.com/rafaello9472/c4ds/tree/main/Run custom training job with pre-built container in Vertex AI Kaggle Stroke Prediction Dataset: www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset Environment variables for special Cloud Storage dire...
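As a rough sketch of the pre-built-container flow in the Vertex AI Python SDK: all display names, the script path, the container image tag, and the machine type below are illustrative placeholders, not taken from the video.

```python
# Hedged sketch: launching a custom training job on a pre-built container
# with the Vertex AI Python SDK (pip install google-cloud-aiplatform).
def run_prebuilt_training(project: str, location: str, staging_bucket: str):
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location,
                    staging_bucket=staging_bucket)

    job = aiplatform.CustomTrainingJob(
        display_name="stroke-prediction-training",
        script_path="task.py",  # local training script to upload
        # Pre-built training image; check the docs for current tags.
        container_uri="us-docker.pkg.dev/vertex-ai/training/sklearn-cpu.1-0:latest",
        requirements=["pandas", "gcsfs"],
    )
    # Runs the script on managed infrastructure; blocks until completion.
    return job.run(replica_count=1, machine_type="n1-standard-4")
```

The SDK stages `task.py` to the staging bucket and submits the job; no Docker build is needed, which is the main difference from the custom-container tutorial above.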
Predict with online prediction in Vertex AI
4.4K views · 2 years ago
This tutorial shows how to make predictions on a tabular dataset with online prediction in Vertex AI. Link to the GitHub repo with code from this tutorial: github.com/rafaello9472/c4ds/tree/main/Predict with online prediction in Vertex AI Link to gcloud CLI authorization: cloud.google.com/sdk/docs/authorizing 00:00 Introduction 00:36 Endpoint setup 03:22 Online prediction - Python 08:32 Online pr...
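The online-prediction call described above can be sketched as follows. The endpoint ID and the feature names (taken to resemble the Kaggle stroke dataset used elsewhere on this channel) are illustrative assumptions, not the tutorial's actual values.

```python
# Hedged sketch of calling a deployed Vertex AI endpoint for online
# prediction with the Python SDK.
def predict_online(project: str, location: str, endpoint_id: str, instances):
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    endpoint = aiplatform.Endpoint(endpoint_id)
    # Returns a Prediction object whose .predictions holds the results.
    return endpoint.predict(instances=instances)


# One instance per row; keys must match the model's training features.
sample_instances = [{
    "gender": "Male",
    "age": "67.0",
    "hypertension": "0",
    "heart_disease": "1",
    "avg_glucose_level": "228.69",
    "bmi": "36.6",
}]
```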
Predict with batch prediction in Vertex AI
4.9K views · 2 years ago
This tutorial shows how to make predictions on a tabular dataset with batch prediction in Vertex AI. Link to the GitHub repo with code from this tutorial: github.com/rafaello9472/c4ds/tree/main/Predict with batch prediction in Vertex AI 00:00 Introduction 00:33 Dataset discussion 01:49 Batch prediction job setup in Jupyter 03:31 Going through predictions in BigQuery 04:12 Transforming raw results...
Train AutoML model in Vertex AI
2.6K views · 2 years ago
This tutorial shows how to train an AutoML classification or regression model on a tabular dataset in Vertex AI. Google documentation describing tabular dataset preparation: cloud.google.com/vertex-ai/docs/datasets/prepare-tabular Link to the GitHub repo with code from this tutorial: github.com/rafaello9472/c4ds/blob/main/Train AutoML model in Vertex AI/classification.ipynb Kaggle Stroke Prediction ...
Create Tabular Dataset in Vertex AI
2.8K views · 2 years ago
This tutorial shows how to create a Tabular Dataset in Vertex AI, from a BigQuery table, a Google Cloud Storage CSV file, or a Pandas DataFrame. Link to the GitHub repo with code from this tutorial: github.com/rafaello9472/c4ds/tree/main/Create tabular dataset in Vertex AI 0:00 Datasets creation options 0:42 Create from BigQuery table 2:58 Create from Google Cloud Storage CSV file 4:12 Create from Panda...
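The three creation paths listed in the chapters above can be sketched with the Vertex AI Python SDK roughly like this; every project, table, bucket, and file name is a placeholder, not the tutorial's actual values.

```python
# Hedged sketch: the three Tabular Dataset creation paths in the
# Vertex AI Python SDK (pip install google-cloud-aiplatform pandas).
def create_datasets(project: str, location: str):
    from google.cloud import aiplatform
    import pandas as pd

    aiplatform.init(project=project, location=location)

    # 1) From a BigQuery table
    ds_bq = aiplatform.TabularDataset.create(
        display_name="from-bq",
        bq_source="bq://my-project.my_dataset.my_table",
    )

    # 2) From a CSV file in Google Cloud Storage
    ds_gcs = aiplatform.TabularDataset.create(
        display_name="from-gcs",
        gcs_source=["gs://my-bucket/data.csv"],
    )

    # 3) From a Pandas DataFrame (staged through a BigQuery table)
    df = pd.read_csv("local_data.csv")
    ds_df = aiplatform.TabularDataset.create_from_dataframe(
        df_source=df,
        staging_path="bq://my-project.my_dataset.staging",
        display_name="from-dataframe",
    )
    return ds_bq, ds_gcs, ds_df
```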
Connect Jupyter Notebook with Vertex AI
6K views · 2 years ago
This tutorial shows how to connect your Jupyter Notebook with Vertex AI from both GCP and a local environment. Link to the documentation describing the process of setting the environment variable: cloud.google.com/docs/authentication/getting-started#setting_the_environment_variable Link to the GitHub repo with code from this tutorial: github.com/rafaello9472/c4ds/tree/main/Connect Jupyter Not...
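The local-authentication step linked above boils down to pointing the standard credentials environment variable at a downloaded service-account key and initializing the SDK. The key path, project, and region below are placeholders.

```python
# Hedged sketch of authenticating a local Jupyter Notebook against
# Vertex AI via a service-account key file.
import os

# Application Default Credentials: the Google client libraries pick this
# variable up automatically.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/service-account-key.json"


def init_vertex(project: str, location: str = "us-central1"):
    from google.cloud import aiplatform
    aiplatform.init(project=project, location=location)
```

On a GCP-hosted notebook (e.g. Vertex AI Workbench) the environment variable is unnecessary, since credentials come from the attached service account.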

Comments

  • @ahmetozturk2016 · 1 month ago

    It was a wonderful and fluent explanation. Thank you. I am a bit confused, though: all the explanations can be done using standard SQL, so what is the benefit of Dataform? Thanks in advance.

  • @sarathysrm · 3 months ago

    Thanks for these wonderful insights

  • @EternalAI-v9b · 3 months ago

    Hello, are you still active and available for questions? I would like to ask if it's possible to have a function that starts and shuts down a VM within Google Cloud itself? Thanks

  • @Pieter-JanDeLaere · 3 months ago

    Thank you for the interesting explanation. Would it be possible to share the slide deck? That would be great

    • @cloud4datascience772 · 3 months ago

      Sure, here you go docs.google.com/presentation/d/1w0dthQkz5Wo84vDqO4OWZde9W_JjrmqHxDonXEKIisM/edit?usp=sharing

    • @Pieter-JanDeLaere · 3 months ago

      @cloud4datascience772 Thanks, I can use it to explain some more to my colleagues

  • @algorithmo134 · 4 months ago

    Excellent tutorial! Do you have a video on pipelines with Vertex AI?

  • @algorithmo134 · 4 months ago

    Hi, I love your videos so far! Can you make a video on Vertex Pipelines with cloud scheduling for model training and deployment to an application? Looking forward to your reply!

  • @alphonsemeltz314 · 4 months ago

    Great video, very clear. Thank you

  • @marcinlaskowski3357 · 5 months ago

    Hi, as far as I can see, there are no branches in the GitHub repo linked under the video. Is that because some code updates and upgrades are coming? Very good presentation, and I should say that before watching the tutorial there were a few places that gave me hard stops. Now everything seems clearer. Thank you! Kind regards.

    • @cloud4datascience772 · 5 months ago

      Hi, I am glad that you liked the video, thank you for the kind words! When it comes to branches, GitHub moved them to stale branches, but you can still access them: github.com/rafaello9472/dataform-demo/branches/stale

    • @marcinlaskowski3357 · 5 months ago

      @cloud4datascience772 🙏🏻 Thank you very much, I'll definitely do that ;)

  • @joaomanoellins2219 · 6 months ago

    Thank you! Great video

  • @aljauzi1941 · 6 months ago

    this is very helpful, thank you

  • @blademan-5 · 6 months ago

    This video was very helpful for me! Thank you 👍

  • @patriciodiaz2377 · 6 months ago

    I have a question: what if I had a script using the Selenium (web scraping) library? There would be a problem, right? Because as far as I can understand, it needs a driver installed on your machine to work 😪

    • @cloud4datascience772 · 6 months ago

      You can install libraries using requirements.txt, otherwise it could be a challenge if you want to customize the execution environment

  • @patriciodiaz2377 · 6 months ago

    Thanks a lot for sharing your knowledge!! Greetings from Mexico

  • @AdhvaithG · 6 months ago

    Hi ... I wanted to take a moment to express my appreciation for your videos. They have been incredibly helpful as I learn about GCP, and your clear explanations make complex topics much more accessible. One question: here everything is done manually using the GUI; alternatively, can we do the same process using the Python SDK, right from building the package and pushing it to a Cloud Storage bucket, through training setup, Model Registry, deployment, and endpoint creation? If yes, when you get time, could you please post a video on this?

    • @cloud4datascience772 · 6 months ago

      Thank you for the kind words! You are right, the majority of these things can also be done using the Python SDK or the gcloud CLI tool. I always try to focus on the GUI approach first, as once you learn it, it's much easier to automate the entire process with code. Unfortunately, due to a lot of project work on my end recently, I am not planning any new videos on that topic in the near future. I might come back to it, but I don't know when that might happen. I hope this video will be a sufficient starting point for you!

  • @ekhemka-x6q · 7 months ago

    I want to create a text dataset, but all of my text is in PDF form. How would I go about doing that?

    • @August-m8l · 4 months ago

      same, did you find a resolution on this?

    • @ekhemka-x6q · 4 months ago

      @August-m8l I didn't create a text dataset; I just used the PDFs as-is. I created a GCS bucket with the PDFs, and I used Gemini multimodal to process the text within the files. Hope that helps!

    • @August-m8l · 4 months ago

      @ekhemka-x6q Thanks!

  • @flosrv3194 · 7 months ago

    What is the zip file? I didn't understand what it is about.

  • @saurabhbhardwaj3112 · 8 months ago

    Thanks for sharing this brief and informative tutorial! Really helpful👍

  • @AbhijitKumar-uw8fd · 8 months ago

    Thanks, very informative video. You created a Git repository in public mode; were you able to connect a private Git repo too?

  • @SonaliSrijan · 8 months ago

    Hello, thanks for your helpful videos! Question: I want to perform batch prediction for a pre-built foundation model (Llama 3). I have downloaded the Llama 3 chat-8b model into the Vertex AI Model Registry. When I try to start a batch prediction job, I get the following issue: InternalServerError: 500 Unknown ModelSource source_type: MODEL_GARDEN model_garden_source { public_model_name: "publishers/meta/models/llama3" } for model projects/591244989428/locations/us-east1/models/llama3_chat_8b@1 Any idea what the issue is about? I didn't find any helpful resources on this. Appreciate any help!

  • @Satenc0 · 8 months ago

    Is it possible to use parameterizable JavaScript variables in your SQLX files for the queries?

  • @Satenc0 · 8 months ago

    How can I use Dataform to read from BigQuery tables, make some transformations, and write to other tables?

  • @iamkeithfajardo · 9 months ago

    Great video :)

  • @annie_yeong · 9 months ago

    Thank you so much!! I have a question though: it was said that to link a third-party remote repository to a Dataform repository, you need to authenticate it first. Any thoughts on this?

  • @Rajdeep6452 · 9 months ago

    Lol, it's not working; I can't deploy. What could be the problem? I did exactly what you did, except I couldn't create the same bucket, so I named mine c4ds1 and changed the code accordingly.

  • @MohamedFerrouga · 10 months ago

    Please answer my question; I need to do the same thing you did in this video. My Python script works just fine in Google Cloud Shell. However, I am still having trouble making it work as a Cloud Function. The purpose is to schedule the execution of the function, which extracts data from a website and saves it in a Google Sheet. Any clue from you?

  • @JishnuMittapalli · 10 months ago

    Can we do this using Google compute resources like a GPU? How can we do that?

    • @cloud4datascience772 · 10 months ago

      There is no easy, direct way to do it with Cloud Functions. For a GPU you would need to use Google Compute Engine and select a GPU machine type, but the process of automating the script execution would be much different, and it's not covered in my tutorial.

  • @bhoomivaghasiya2794 · 10 months ago

    Thanks for this video! I have no knowledge of AI/ML; I just want to use Vertex AI for my mobile application. I have a dataset from Kaggle that I downloaded, and it is all images. Now I want to use it to create a model and call the API from my mobile app. How do I do that in the Vertex AI GUI? There are only images; since there isn't any CSV or JSON file, it's very time-consuming to upload hundreds of images and label every one, because I will be using image object detection. Is there any direct way to get the CSV or JSON from Kaggle, or can I get a trained model directly? What's the exact flow to get what I want?

    • @cloud4datascience772 · 10 months ago

      Hi, you need to prepare either a CSV or JSONL file with image locations and labels. Once you have it, use it to create an image dataset in Vertex AI. This, of course, requires some programming knowledge, as I don't believe there will be a ready-made file for that purpose on Kaggle. I'm showing the process of creating such a file from images I uploaded to Cloud Storage; hopefully, it can be a good start for you. If you have any doubts regarding the file, you can always refer to the documentation => cloud.google.com/vertex-ai/docs/image-data/classification/prepare-data
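      As a rough sketch of the CSV import file the reply above describes: one "gs://...,label" row per image, with the label derived here from the parent folder name. The bucket paths are illustrative placeholders; in practice the URIs would come from listing a bucket (e.g. with google-cloud-storage).

```python
# Hedged sketch: building a Vertex AI image-classification import CSV
# from a list of Cloud Storage image URIs.
import csv
import io


def build_import_csv(image_uris):
    buf = io.StringIO()
    writer = csv.writer(buf)
    for uri in image_uris:
        # e.g. gs://bucket/lemons/good_quality/img1.jpg -> "good_quality"
        label = uri.rsplit("/", 2)[-2]
        writer.writerow([uri, label])
    return buf.getvalue()


uris = [
    "gs://my-bucket/lemons/good_quality/img1.jpg",
    "gs://my-bucket/lemons/bad_quality/img2.jpg",
]
csv_text = build_import_csv(uris)
```

      Uploading the resulting file to Cloud Storage and pointing the dataset-creation step at it completes the import.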

  • @umamaheshmeka1032 · 10 months ago

    Great job! I followed your instructions, and everything started working smoothly for me. Your tutorial is fantastic - keep up the excellent work! You have the potential to reach 1 million subscribers. Keep pushing forward !!!

    • @cloud4datascience772 · 10 months ago

      That is great to hear! Thank you for the kind words :)

  • @59600muslim · 10 months ago

    Thank you! Your explanations are very clear!

  • @xyz-jn4oj · 10 months ago

    Hey, what about model deployment? Can you make a video on it?

  • @GarimaSingh-x4w · 11 months ago

    Where can I get the stroke data?

    • @cloud4datascience772 · 11 months ago

      Here you go => www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset

  • @alperakbash · 11 months ago

    Thank you so much for such a wonderful tutorial. Perfectly designed and structured. And thank you so much for introducing me to such a powerful platform.

  • @victorricardo8482 · 11 months ago

    Hi! First of all, great video! Really simple and intuitive. It worked for me when I called only one function inside hello_pubsub, but when I tried to call several others from other .py files, it looked like the function ran perfectly but produced no results. Is there a way to make the Cloud Function wait for each function to finish before moving to the next one? Thanks

    • @victorricardo8482 · 11 months ago

      In other words, I need the second function to get the result from the first, and so on. That's why I need it to wait.

  • @EnoEbiong · 11 months ago

    I am so excited about the whole explanations

  • @EnoEbiong · 11 months ago

    This is great

  • @EnoEbiong · 11 months ago

    It’s very important to listen to this lecture and following same to be able to make online business accounts realized

  • @EnoEbiong · 11 months ago

    This is awesomely exciting news about your business account

  • @EnoEbiong · 11 months ago

    Depository platform is a very wide program to use in creating your own Database platform

  • @EnoEbiong · 11 months ago

    I am so excited about your examples like Java print depository and all others you’ve mentioned

  • @EnoEbiong · 11 months ago

    Thanks again and again for sharing this Database Depository code

  • @EnoEbiong · 11 months ago

    I appreciate your interest in sharing this wonderful business opportunities online with your friends and family members who love sharing their business opportunities online with their businesses

  • @EnoEbiong · 11 months ago

    Thanks so much for your concern about this wonderful program on Database form for businesses online

  • @enricocompagno3513 · 11 months ago

    How can I make batch predictions with a custom container? It would be nice to have a tutorial that uses a custom container to run a training job, saves the model in the Model Registry, and then runs a batch prediction.

  • @siddharthalama8319 · 1 year ago

    This video saved me. I was almost losing my patience while looking for this code in the GCP platform, and I couldn't find it. Instead of using Jupyter Notebook, I am implementing it as a Cloud Function to automate the process of training every three months. THANK YOU.

  • @WidadZizouanewalo · 1 year ago

    Thanks for the demo. What if I have two versions of the model and I want to use the second one instead of the first?

  • @ChandankumarGUPTA-p8n · 1 year ago

    Great video!!! Could you please let me know how to use these managed datasets in a Kubeflow component that will then be used to execute a Vertex AI Pipeline?

  • @alphaalpha4595 · 1 year ago

    best indian ever love u from morocco snor and chih and sk7k7 and hatim l7waa

  • @abdullahnasir8535 · 1 year ago

    Love you man

  • @bonyadnouri6548 · 1 year ago

    For me, the project parameter had to be the project ID, not the project name; that seems to be the solution to a 403 for people on Stack Overflow.

  • @CleytonPereira-g6u · 1 year ago

    Is it possible to read a file directly from Google Cloud Storage using Dataform?