Creating an ETL Data Pipeline on Google Cloud with Cloud Data Fusion & Airflow - Part 1

  • Published: Jan 6, 2025

Comments • 57

  • @AR-by2lk
    @AR-by2lk 9 months ago

    Thank you, Vishal, for doing this. It will definitely be a great help! Kudos to you!

  • @rajeshiyer4999
    @rajeshiyer4999 5 months ago +1

    Thanks Vishal for the detailed pipeline design and development video. Great job.

  • @harinarayanan5975
    @harinarayanan5975 3 days ago

    Great content!!!

  • @royal_dsz
    @royal_dsz 3 months ago

    Thanks Vishal, this was very helpful

  • @LMGaming0
    @LMGaming0 5 months ago

    Very simple and well explained, thanks!

  • @zikoalexis2751
    @zikoalexis2751 9 months ago +1

    Thank you for the help

  • @basavrajningadali4919
    @basavrajningadali4919 5 months ago

    I am not able to create a Composer environment.

  • @Alfred_vinci
    @Alfred_vinci 9 months ago

    In place of Airflow, I want to use Mage AI.

  • @adijos92
    @adijos92 10 months ago

    The Cloud Composer environment shows an error, and the image version is not shown when creating the environment manually. Is there any update?

    • @adijos92
      @adijos92 9 months ago

      Please reply to that.

  • @abhisheknaidu8877
    @abhisheknaidu8877 8 months ago

    I am getting several environment errors while connecting Data Fusion, and the Python code has an error.

  • @lug__aman
    @lug__aman 1 month ago

    Sir, I have a lot of CSV data in my PostgreSQL DB. I want to transfer that data to BigQuery with real-time data streaming/processing. Which service do I need to use? Can you please give me some context? I am new to DE, and my company gave me this task.

    • @techtrapture
      @techtrapture  1 month ago

      ruclips.net/video/L4Ad7RQYv4o/видео.html

    • @techtrapture
      @techtrapture  1 month ago

      You can use Datastream:
      ruclips.net/video/L4Ad7RQYv4o/видео.html

    • @lug__aman
      @lug__aman 1 month ago

      @@techtrapture But I am having a problem with replication in Postgres. How do I set up this replication?

  • @lmarwarl
    @lmarwarl 7 months ago +1

    Amazing video. Unfortunately, I have problems creating my Cloud Composer environment, maybe because I am on a free trial.
    I get this error after creating the environment:
    CREATE operation on this environment failed 49 minutes ago with the following error message:
    Some of the GKE pods failed to become healthy. Please check the GKE logs for details, and retry the operation.

    • @Abracadanz00
      @Abracadanz00 7 months ago

      I'm having the same issue, any idea how to resolve it?

    • @lmarwarl
      @lmarwarl 7 months ago

      @@Abracadanz00 Nothing yet, but after a lot of searching I read a post from Google that says you have to activate your billing account in GCP before creating the Cloud Composer environment.

    • @paranoya733
      @paranoya733 5 months ago

      @@Abracadanz00 If you want a shorter, free pipeline, then at 14:57 in this part cut out Cloud Composer, Cloud Storage, Cloud Data Fusion, and BigQuery, and replace them with a short free pipeline: Google Sheets (data) -> Looker Studio. If you extract API data, add the extension called "API Connector" in Google Sheets and configure it (search on YouTube), then -> Looker Studio.

  • @TeekawinKirdsaeng
    @TeekawinKirdsaeng 8 months ago

    How do I use gcloud in VS Code?
    Error: gcloud : The term 'gcloud' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct
    and try again

    • @techtrapture
      @techtrapture  8 months ago +1

      Install the Google Cloud SDK on your system. Use the link below:
      cloud.google.com/sdk/docs/install#windows
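
The "not recognized" error above usually means the shell cannot find `gcloud` on its PATH. A minimal Python sketch for checking that before running any login commands, assuming only the standard library; the helper names `find_gcloud` and `gcloud_status` are hypothetical and not part of the video:

```python
import shutil


def find_gcloud(executable: str = "gcloud"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


def gcloud_status(executable: str = "gcloud") -> str:
    """Describe whether the executable can be found by the current shell."""
    path = find_gcloud(executable)
    if path is None:
        return (
            f"'{executable}' not found on PATH; install the Google Cloud SDK "
            "and reopen the terminal."
        )
    return f"'{executable}' found at {path}"


if __name__ == "__main__":
    print(gcloud_status())
```

If the SDK was installed while VS Code was already open, the integrated terminal may still hold the old PATH, so restarting the editor is worth trying before reinstalling.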

  • @vikascbr
    @vikascbr 4 months ago

    Thanks very helpful

  • @selvaarul8258
    @selvaarul8258 7 months ago +1

    Awesome video. Can you create a complete Composer/Airflow video for this one?

    • @techtrapture
      @techtrapture  7 months ago

      There is a separate playlist for Composer.
      Cloud Composer - Airflow on GCP: ruclips.net/p/PLLrA_pU9-Gz22Zml5mxcszG4A9ecqWtd4

  • @yishanzhan6066
    @yishanzhan6066 9 months ago

    I got these errors "Cannot load filesystem: java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.hdfs.web.HftpFileSystem not found. Can not load the default value of `spark.yarn.isHadoopProvided` from `org/apache/spark/deploy/yarn/config.properties` with error, java.lang.NullPointerException. Using `false` as a default value." Any clues on how to fix it?

    • @figh761
      @figh761 8 months ago

      Did you fix this?

    • @akshaymantena6699
      @akshaymantena6699 6 months ago

      I'm also getting the same error. Did you fix it?

    • @Daswinian
      @Daswinian 2 months ago

      I think it's a permission issue. Try adding the following roles to the compute service account your Data Fusion instance uses:
      Dataproc Service Agent
      Dataproc Worker
      Editor
      Service Account User
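
The role suggestion above can be sketched as a small Python helper that only builds the `gcloud projects add-iam-policy-binding` commands as strings, so they can be reviewed before anything is run. The role IDs are my mapping of the display names in the comment and should be checked in the IAM console; the project ID and service account email are placeholders, and `build_binding_commands` is a hypothetical name:

```python
import shlex

# Assumed mapping from the display names in the comment to IAM role IDs.
ROLES = {
    "Dataproc Service Agent": "roles/dataproc.serviceAgent",
    "Dataproc Worker": "roles/dataproc.worker",
    "Editor": "roles/editor",
    "Service Account User": "roles/iam.serviceAccountUser",
}


def build_binding_commands(project_id: str, service_account_email: str) -> list:
    """Build one gcloud command string per role; nothing is executed."""
    member = f"serviceAccount:{service_account_email}"
    commands = []
    for role in ROLES.values():
        parts = [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}",
            f"--role={role}",
        ]
        commands.append(" ".join(shlex.quote(part) for part in parts))
    return commands


if __name__ == "__main__":
    # Placeholder values; replace with your own project and service account.
    for command in build_binding_commands(
        "my-project", "123456-compute@developer.gserviceaccount.com"
    ):
        print(command)
```

Printing the commands first makes it easy to drop any role that turns out not to be needed.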

  • @TheIlyasqazi
    @TheIlyasqazi 1 month ago

    Can you please create another video showing how to download Excel data from a SharePoint site, load it into BigQuery, and run this as a daily job? Also, is it possible to do this entire process through code using Terraform? Thanks.

    • @techtrapture
      @techtrapture  1 month ago +2

      You came up with project requirements, not a video 😀

    • @TheIlyasqazi
      @TheIlyasqazi 1 month ago

      😂

    • @TheIlyasqazi
      @TheIlyasqazi 1 month ago

      I have heard this kind of real-world requirement in many places and on many forums, so I thought I would share it in case you could help. At the same time, I am also trying it myself. Thanks for all the educational videos.

    • @sivaramsathiamoorthi87
      @sivaramsathiamoorthi87 26 days ago

      @@techtrapture Bro, please, it would be great if you could provide this 😄🙏

    • @aiwinmanuel7313
      @aiwinmanuel7313 24 days ago

      I would suggest using automation tools like Blue Prism for this.

  • @asifshaharia2756
    @asifshaharia2756 4 months ago

    I'm facing a problem. In Cloud Data Fusion, some of the values in the phone_number and ssn fields are missing, and the date of birth and password columns are completely empty. Could you please help me troubleshoot it?

    • @zaraazar5513
      @zaraazar5513 4 days ago

      Yes, it can happen. Ask GPT to give you clean code.

  • @anonymous8038-c4m
    @anonymous8038-c4m 6 months ago +1

    Fusion is not parsing the salary and many other fields, although they are in the CSV.

    • @zaraazar5513
      @zaraazar5513 4 days ago

      The parser may be getting stuck on double quotes. Ask GPT to give you clean code without \ or "".
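
The double-quote problem mentioned above can be illustrated with a short Python sketch using the standard `csv` module. The sample row and column names are made up for illustration, and whether this is the actual cause in the pipeline from the video is an assumption:

```python
import csv
import io

# Hypothetical sample: two fields contain commas and are wrapped in quotes.
SAMPLE = 'name,salary,address\n"Doe, Jane",50000,"12 Main St, Apt 4"\n'


def naive_split(line: str) -> list:
    """Split on every comma, ignoring quotes (this breaks quoted fields)."""
    return line.split(",")


def parse_rows(text: str) -> list:
    """Parse CSV text with a quote-aware reader."""
    return list(csv.reader(io.StringIO(text)))


if __name__ == "__main__":
    data_line = SAMPLE.splitlines()[1]
    print(len(naive_split(data_line)), "fields from naive split")
    print(parse_rows(SAMPLE)[1])
```

A naive split shifts every column after the first quoted comma, which can look like missing or empty fields downstream.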

  • @basavrajningadali4919
    @basavrajningadali4919 5 months ago

    I am not getting the mask data option in Wrangler.

    • @zaraazar5513
      @zaraazar5513 4 days ago

      Try converting Salary from Int to the String data type.

  • @renvils
    @renvils 7 months ago

    Great video as always! Can you add timestamps to this video?

  • @SasankPasupuleti
    @SasankPasupuleti 29 days ago

    Do a project for ELT as well.

    • @techtrapture
      @techtrapture  29 days ago +1

      Sure, I will do it soon.

    • @techtrapture
      @techtrapture  28 days ago +1

      Here you go:
      ruclips.net/video/rIUWbSXjKe4/видео.html

  • @abdulfasith7905
    @abdulfasith7905 7 months ago

    Nice video. Can you create a pipeline using server/serverless Dataproc?

  • @fatallny
    @fatallny 8 months ago

    thank you!!

  • @promitdutta3029
    @promitdutta3029 10 months ago +1

    Composer shows "This environment has errors"

  • @punk77777
    @punk77777 10 months ago

    Kindly make this kind of ETL pipeline video with {GCS --> (Composer --- Dataflow) --> BigQuery}.

    • @techtrapture
      @techtrapture  10 months ago +1

      It's already there:
      ruclips.net/video/UXJxcWgxwu0/видео.html

    • @VthePeople4156
      @VthePeople4156 10 months ago

      Please explain the whole project in 3-5 sentences for interview purposes, for example:
      What is the flow of the project?
      Which GCP services are used in the project?
      How did you develop the different modules using the different GCP services?

    • @Rajdeep6452
      @Rajdeep6452 9 months ago

      @@VthePeople4156 Can't you see and tell? Does he have to spoon-feed you now? Your parents still wash your ass?

    • @VthePeople4156
      @VthePeople4156 9 months ago

      @@Rajdeep6452 yes

    • @Rajdeep6452
      @Rajdeep6452 9 months ago

      @@VthePeople4156 idiot xD

  • @flosrv3194
    @flosrv3194 7 months ago +1

    It says gcloud is not an executable, so your login steps don't work for everyone, and you did things beforehand without mentioning them in the video. Next time, please show everything from scratch, and I mean actually doing it on screen, not just saying it.

    • @techtrapture
      @techtrapture  7 months ago

      Apologies if I missed it. You need to install gcloud (Cloud SDK) first to execute your commands.
