jayzern
  • Videos: 13
  • Views: 509,216
What is Dagster? Asset Based Orchestration [2hr full course]
Dagster is a declarative, asset-based orchestrator that redefines the way we think about managing workflows. In this video, we’ll learn about features of Dagster including Assets, Resources, Jobs, Schedules, Partitions and more. We’ll do this by creating an End to End project from scratch using Dagster, data load tool (dlt) and Snowflake. I’m super excited about this one. Hope you’ll learn something new!
Timestamps ⏰
0:00 - Intro
1:45 - System Design
6:30 - Setup Dagster and data load tool (dlt)
21:40 - How to define Assets
40:08 - Resources, code refactor
43:22 - Schedules and Jobs
48:42 - Backfill DAGs using Partitions
54:39 - Sensors and Automaterialization policy
1:09:31 - Final thoughts
Notes ...
Views: 3,342

Videos

Code along - build an ELT Pipeline in 1 Hour (dbt, Snowflake, Airflow)
113K views · 7 months ago
How to build an ELT pipeline in 1 hour, using industry standard tools such as dbt, Snowflake and Airflow. This is a live coding tutorial, where I’ll walk you through the thinking process, and show you every step. We’ll cover basic data modeling techniques (fact tables, data marts), snowflake RBAC concepts, and how to orchestrate a dbt project using Airflow. Drop down in the comments section wha...
Intro to Amazon EMR - Big Data Tutorial using Spark
27K views · 1 year ago
Edit* Make sure you encrypt your Spark script as you upload it inside S3 (timestamp: 13:42) There's a small typo in line 41 of the code, should be "add_argument" Intro Today we're going to talk about a popular tool in Data Engineering. Amazon EMR is an industry-leading big data platform. It's a really mature service developed way back in 2009, and draws a lot of heuristics from the Apache Hadoo...
Top 5 SQL Interview Questions for Data Engineers
5K views · 1 year ago
A lot of people struggle to learn SQL. When it comes to interviews they feel super anxious, especially in this economy where it's getting 10x harder to find jobs. In this video, I'll show you how to CRUSH your next Data Engineering SQL interviews, through these 5 handpicked questions. We'll focus predominantly on the problem solving aspect. At the end of the video, I'll share my tips & tricks o...
How I would learn Data Engineering (if I could start over)
340K views · 1 year ago
In this video, I’ll share my step-by-step process on how I would learn Data Engineering if I could start over. Data Engineering is a fast emerging field within the Tech industry; where more and more people from traditional data science/software backgrounds are pivoting towards. We’ll cover the fundamentals of Data Engineering, and talk about some advanced topics you’ll need to learn in order to...
Living abroad is HARD (what I learned after 8 years in new york and london)
4.4K views · 1 year ago
When you're traveling across different countries, it's very easy to cherry pick the best parts of different places to create a perfect image in your head. The reality is, living abroad in a foreign country versus going on vacation is two completely separate things. In this video, I highlight some key learnings after living abroad for over 8 years, and share insightful tips on how to ease that t...
MALAYSIA | Asia's Hidden Gem
9K views · 1 year ago
Malaysia is one of THE most underrated countries in Asia 🇲🇾. Last December, I went back home after being away for almost 3 years. I couldn't really find any videos that really showcase how amazing the country is (scenery, local food, culture), so I wanted to make one myself. It was a super hectic trip, trying to film and catch up with friends in only 2 weeks. Who am I? 🙋🏻‍♂️ I'm Jay, I love mak...
Fall Foliage in Vermont
650 views · 1 year ago
Road trip from NY to Vermont Thanks to @yarnehermann @andrealee_x @tiffy_le @jd_lassiter @velalu77 and yuki sensei Gear: Canon R6 RF 24-105mm F4-7.1 is STM RF 35mm f/1.8 Macro IS STM Lens Iphone 13 Pro
Los Angeles
909 views · 3 years ago
City of angels Featuring @gareygan @lu8296 @junga_julia @derek_chen01 @ivkyoung Gear: Sony A6400 Tamron 17-70mm f/2.8 Sony E 35mm f/1.8 OSS

Comments

  • @aliceschirina8191 12 hours ago

    Hello, thanks for this tutorial. At the very beginning, when trying to run the "dbt deps" command, I'm getting this error: "Encountered an error loading local configuration: dbt_cloud.yml credentials file for dbt Cloud not found. Download your credentials file from dbt Cloud to `C:\Users\a.schirina\.dbt`". I'm using the dbt command locally, and my profiles.yml in the .dbt folder is:

    data_pipeline:
      target: dev
      outputs:
        dev:
          type: snowflake
          account: jpb45436
          # User/password auth
          user: alices
          password: mypassword
          role: dbt_role
          database: dbt_db
          warehouse: dbt_wh
          schema: dbt_schema
          threads: 4
          client_session_keep_alive: False

    Does anyone know the problem?

  •  16 hours ago

    Thank you for the video, jayzern. When I push code to Git, should I push only the dbt code, or do I need to push all of the dbt-dag code?

  • @brijeshhota550 21 hours ago

    Best tutorial I've seen so far. Was confused between Glue and EMR for a future project requiring big compute power with control over each node.

  • @anggipermanaharianja6122 1 day ago

    nice

  • @krisandreivelasco6801 2 days ago

    Great guide bro...

  • @pkkkpkkk2385 2 days ago

    Thank you brother

  • @aminebouita7185 3 days ago

    Thanks a lot for this tutorial

  • @southafricangamer7174 3 days ago

    So to my understanding, a singular test really checks that the query being tested returns nothing. If the test passes, the query returned no rows: great, your data is fine. If it fails, you should run that query to see exactly which rows came back. Confusing at first, but it makes sense now.
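The pass/fail rule described in the comment above can be sketched in a few lines of Python. This is only an illustration of the idea, not how dbt runs tests internally: it uses the standard-library `sqlite3` module in place of Snowflake, and the `orders` table, its columns, and the `run_singular_test` helper are hypothetical names made up for the example.

```python
import sqlite3


def run_singular_test(conn, test_sql):
    """Run a test query; the test passes only if it returns zero rows."""
    failing_rows = conn.execute(test_sql).fetchall()
    return len(failing_rows) == 0, failing_rows


# In-memory database standing in for the warehouse (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, discount_amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, -5.0), (2, -1.5), (3, 2.0)],
)

# Hypothetical singular test: select the rows that violate the expectation
# "discounts are never positive". Any row returned is a failure.
test_sql = "SELECT order_id, discount_amount FROM orders WHERE discount_amount > 0"

passed, failing_rows = run_singular_test(conn, test_sql)
print(passed, failing_rows)  # False [(3, 2.0)]
```

The query is written to select the bad rows, so an empty result means the data is fine and a non-empty result lists exactly the rows to inspect.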

  • @jamesdeng2780 4 days ago

    What are the versions of pandas and matplotlib used in your project?

  • @ZollMisc-c1w 4 days ago

    Garcia Michelle Davis Jose Jones Jason

  • @yagmurkoksal1013 5 days ago

    very sincere and very true points, thank you very much

  • @ArunKumar-u1r6n 10 days ago

    I am impressed with your video. Can you comment with the books you are referring to? It will help me a lot.

  • @AlbertLorraine-t1y 10 days ago

    Ziemann Bypass

  • @adrianf.9491 11 days ago

    Dagster is amazing but rather complex. This course is amazing!!

  • @dominicaleung7329 12 days ago

    Thank you very much. This is very nice and concise tutorial, exactly what I need.

  • @ONeilPoppy-l1k 12 days ago

    Thompson Timothy Lewis Anthony Lee Richard

  • @abdullahsiddique7787 13 days ago

    Is data engineering dead with advent of AI ? What is the future of data engineering careers in your opinion ?

  • @RockefellerBurgess 13 days ago

    19395 Bryana Station

  • @inadaldaldaldal8231 14 days ago

    Hello, I followed the video and tried to compile and got this error. Please let me know if anyone can assist.

    16:32:38 Running with dbt=1.8.0
    16:32:38 Registered adapter: snowflake=1.8.3
    16:32:38 Unable to do partial parsing because profile has changed
    16:32:38 Unable to do partial parsing because a project dependency has been added
    16:32:38 Unable to do partial parsing because a project config has changed
    16:32:39 Encountered an error:
    Parsing Error
      Error reading oms_dbt_proj: staging\tpch_source.yml - Runtime Error
        Syntax error near line 9
        ------------------------------
        6 | schema: tpch_sf1
        7 | tables:
        8 | - name: orders
        9 | columns:
        10 | - name: o_orderkey
        11 | tests:
        12 | - unique

        Raw Error:
        ------------------------------
        while parsing a block collection
          in "<unicode string>", line 8, column 7
        did not find expected '-' indicator
          in "<unicode string>", line 9, column 7

  • @ShirleyWheeler-j2o 15 days ago

    Gonzalez Margaret Clark Linda Brown Eric

  • @christophercampo9099 17 days ago

    Thank you, thank you THANK YOU! This was so helpful, easy to follow and made perfect sense.

  • @abderrahmanehamim6692 17 days ago

    Thank you very much

  • @LiamAlixsons-o1b 19 days ago

    Miller Barbara Thomas Sarah Williams Edward

  • @Goku-ev9np 21 days ago

    I'm 23 and switching careers from aviation maintenance mechanic to the tech industry, trying to get into the data field, as I've heard the phrase "data is king". I was debating between coding and cybersecurity, but I feel like data is the best spot and data engineering sounds like my niche. My question is: should I go to a regular community college (BCC for me in South Florida) or a university as well, or can I break into the field through certifications from a technical/vocational school? And which certifications are the baseline I should go for? I'm trying to find the most efficient and fastest way to break in considering my age (23). I'm fully committed to this as well.

  • @ZyklonB-88 21 days ago

    why do you need to create a VPC?

    • @etf_chach 6 hours ago

      VPC is for nodes. It allows them to communicate between each other and the master node.

  • @GarrettSchwarzenbach-u9v 21 days ago

    Roberts Isle

  • @WebbIsaac-l5h 25 days ago

    Johnson Dorothy Martinez Susan Hall Anna

  • @ktswjp 26 days ago

    Hi, thanks for the high-level overview of Dagster. Do you plan to do a follow-up on how to test the whole pipeline or its individual elements?

    • @jayzern 25 days ago

      Honestly still thinking! It depends on whether there's enough interest in more Dagster videos. Hard to make videos when I'm working full time 😅

  • @ktswjp 26 days ago

    In the video at this timestamp ruclips.net/video/Xe8wYYC2gWQ/видео.htmlfeature=shared&t=675, I noticed you're using `setuptools` in `setup.py` instead of relying on a `requirements.txt` file. I'm curious-what are the advantages of using `setuptools` over the more common `pip` or `pipenv` approaches? Many of the packages you listed seem to be available via `pip`, so it seems to add a bit of complexity. Could you explain the reasoning behind this choice? By the way, I’m not criticizing the approach, just genuinely interested in understanding the benefits.

    • @jayzern 25 days ago

      Hey, this is a really interesting question.

      `requirements.txt` lists the packages you want to install using pip, but it doesn't describe how the package is installed. The filename is arbitrary, and u can even call it `another_requirements.txt` and run `pip install -r` on it.

      `setup.py` and `setuptools` are an alternative approach that installs pip dependencies and also defines your Python package (name, metadata, packages, etc.). It's more narrow in the sense that you're building for a single project only, and it's meant for redistributing your software on other machines, whereas `requirements.txt` is more suited for development environments.

  • @AdamArzemy 26 days ago

    Omg yr Malaysia!

  • @MarionWilliams-m7x 27 days ago

    Brown Gary Martinez Dorothy Garcia James

  • @mahmoudfadaly8074 28 days ago

    the type of video that makes me wanna quit the field because of how bad i feel about the level I am in , but its a very helpful video though

  • @AnthonyTouset 29 days ago

    This is a great video. Thanks for providing real value.

  • @jeahyunkim3141 1 month ago

    thank you!! I watched the RUclips demo and it was really helpful. I also want to study spark on eks

  • @gpt_Lucifer 1 month ago

    🙏

  • @corbanb 1 month ago

    Jay! Thanks for the video and content very cool to see. Curious why Airflow over something like FiveTran besides the ability to self host? Any gotchas?

    • @jayzern 1 month ago

      FiveTran is not really an orchestration tool - it's really meant for the "Extract Load" part only. It's great because of Unix philosophy, i.e. "do one thing, do one thing well only", whereas Airflow is more of a generalist, task-based orchestrator. Another thing is FiveTran is super expensive, unless you're working on something enterprise-y

  • @princeprasanth6310 1 month ago

    bro is so busy in guiding us, he forgot to drink 4 liters of water everyday

  • @Halasaar 1 month ago

    Currently working full time while I get my degree. I would love to get a job in database engineering when I graduate. Helpful info, ty!

  • @LastOneStandingg 1 month ago

    Does this apply to freshers as well?

  • @uppinder 1 month ago

    26:00 item_discount_amount is supposed to be negative because the macro defined it as such. I also checked the data on snowflake and they're all negative amounts. Did I miss something?

  • @albertcampillo 1 month ago

    Hi @jayzern, thanks a lot for your video, really valuable content!

  • @CosmicNomad 1 month ago

    This is such an amazing video @jayzern! The project taken was not overly complex but also not barebones and covered a lot of important stuff! Thanks for being thoughtful and including the code along link (else some of formatting issues would have bugged many newbies)! I think you should keep creating more videos as you are a good teacher. Only suggestion I have is may be include a bit more explanation, which will help beginners even more! Kudos!

  • @tianhockwoo3025 1 month ago

    Hello, did anyone else face this error in Airflow after @32:50?

    Broken DAG: [/usr/local/airflow/dags/dbt-dag.py]
    Traceback (most recent call last):
      File "/usr/local/lib/python3.12/site-packages/cosmos/operators/base.py", line 361, in __init__
        self.full_refresh = full_refresh
        ^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/site-packages/airflow/models/baseoperator.py", line 1198, in __setattr__
        if key in self.__init_kwargs:
        ^^^^^^^^^^^^^^^^^^
    AttributeError: 'DbtRunLocalOperator' object has no attribute '_BaseOperator__init_kwargs'. Did you mean: '_BaseOperator__instantiated'?

    Please send help.

    • @CosmicNomad 1 month ago

      I am facing the exact same error. Please post a reply, if you were able to figure out the fix. I'll do the same if I find a solution.

    • @CosmicNomad 1 month ago

      Ok, so I think I was able to find the thread related to this issue.. Its still open as of 8/18/2024 11pm PT.. github.com/astronomer/astronomer-cosmos/issues/1161

  • @GeorgeNyamao 1 month ago

    Thanks @jayzern. This tutorial is awesome. I will be recommending it to folks who struggle with connecting dbt with any database engine.

  • @SteynGun-n2u 1 month ago

    Hi guys, kindly help me out: are Snowflake and dbt alone enough, or do I have to learn Hadoop, Spark, etc.? I have been working as a data analyst for the last year and am planning to switch to DE.

  • @RyanRichardson-c8d 1 month ago

    "Death by a thousand microservices" comes to mind lol

  • @JuandeSouzaVargasCosta 1 month ago

    I'm starting my learning journey from here. It's a completely new world for me, both because I'm not a native English speaker and because I'm totally new to all this technology. I've made up my mind: I want to become a Data Engineer, and I'm going to work really hard to achieve that! Thank you so much for the guidance; I’m going to put it into practice now.

  • @mohitupadhayay1439 1 month ago

    AN ABSOLUTE GOLDMINE OF INFORMATION WHICH NOT ANY UDEMY OR RUclips TUTOR HAS PROVIDED YET!

  • @SaurabhKrPathak 1 month ago

    Just for the information of all the learners: this is not how things are done in the tech industry. You need to understand Terraform scripts along with Jenkins, which deploys the AWS services. You will not get access to the management console to play around and do stuff.

  • @EstebanHenryG 1 month ago

    Hi Jay, I'm from Latin America. I'm going to start a career in web development and design this month. I speak English really well and have experience as a field engineer in many places. How can I apply for a job in the US/EU from South America? Any tip is welcome, ty!