jayzern
  • Videos 13
  • Views 619,769
What is Dagster? Asset Based Orchestration [2hr full course]
Dagster is a declarative, asset-based orchestrator that redefines the way we think about managing workflows. In this video, we’ll learn about features of Dagster including Assets, Resources, Jobs, Schedules, Partitions and more. We’ll do this by creating an End to End project from scratch using Dagster, data load tool (dlt) and Snowflake. I’m super excited about this one. Hope you’ll learn something new!
Timestamps ⏰
0:00 - Intro
1:45 - System Design
6:30 - Setup Dagster and data load tool (dlt)
21:40 - How to define Assets
40:08 - Resources, code refactor
43:22 - Schedules and Jobs
48:42 - Backfill DAGs using Partitions
54:39 - Sensors and Automaterialization policy
1:09:31 - Final thoughts
Notes ...
Views: 8,481

Videos

Code along - build an ELT Pipeline in 1 Hour (dbt, Snowflake, Airflow)
165K views · 10 months ago
How to build an ELT pipeline in 1 hour, using industry standard tools such as dbt, Snowflake and Airflow. This is a live coding tutorial, where I’ll walk you through the thinking process, and show you every step. We’ll cover basic data modeling techniques (fact tables, data marts), snowflake RBAC concepts, and how to orchestrate a dbt project using Airflow. Drop down in the comments section wha...
Intro to Amazon EMR - Big Data Tutorial using Spark
35K views · 1 year ago
Edit* Make sure you encrypt your Spark script as you upload it inside S3 (timestamp: 13:42) There's a small typo in line 41 of the code, should be "add_argument" Intro Today we're going to talk about a popular tool in Data Engineering. Amazon EMR is an industry-leading big data platform. It's a really mature service developed way back in 2009, and draws a lot of heuristics from the Apache Hadoo...
Top 5 SQL Interview Questions for Data Engineers
6K views · 1 year ago
A lot of people struggle to learn SQL. When it comes to interviews they feel super anxious, especially in this economy where it's getting 10x harder to find jobs. In this video, I'll show you how to CRUSH your next Data Engineering SQL interviews, through these 5 handpicked questions. We'll focus predominantly on the problem solving aspect. At the end of the video, I'll share my tips & tricks o...
How I would learn Data Engineering (if I could start over)
381K views · 1 year ago
In this video, I’ll share my step-by-step process on how I would learn Data Engineering if I could start over. Data Engineering is a fast-emerging field within the tech industry, one that more and more people from traditional data science/software backgrounds are pivoting towards. We’ll cover the fundamentals of Data Engineering, and talk about some advanced topics you’ll need to learn in order to...
Living abroad is HARD (what I learned after 8 years in new york and london)
4.5K views · 1 year ago
When you're traveling across different countries, it's very easy to cherry-pick the best parts of different places to create a perfect image in your head. The reality is, living abroad in a foreign country and going on vacation are two completely separate things. In this video, I highlight some key learnings after living abroad for over 8 years, and share insightful tips on how to ease that t...
MALAYSIA | Asia's Hidden Gem
9K views · 1 year ago
Malaysia is one of THE most underrated countries in Asia 🇲🇾. Last December, I went back home after being away for almost 3 years. I couldn't really find any videos that really showcase how amazing the country is (scenery, local food, culture), so I wanted to make one myself. It was a super hectic trip, trying to film and catch up with friends in only 2 weeks. Who am I? 🙋🏻‍♂️ I'm Jay, I love mak...
Fall Foliage in Vermont
666 views · 2 years ago
Road trip from NY to Vermont Thanks to @yarnehermann @andrealee_x @tiffy_le @jd_lassiter @velalu77 and yuki sensei Gear: Canon R6 RF 24-105mm F4-7.1 is STM RF 35mm f/1.8 Macro IS STM Lens Iphone 13 Pro
Los Angeles
947 views · 3 years ago
City of angels Featuring @gareygan @lu8296 @junga_julia @derek_chen01 @ivkyoung Gear: Sony A6400 Tamron 17-70mm f/2.8 Sony E 35mm f/1.8 OSS

Comments

  • @BTC4444
    @BTC4444 6 days ago

    Why did you choose airflow over dagster?

  • @HandsomeSmells
    @HandsomeSmells 7 days ago

    brilliant tutorial, thanks for this!

  • @mayconpires.oficial
    @mayconpires.oficial 10 days ago

    Thank for rich content!

  • @prajnaaddagarla9085
    @prajnaaddagarla9085 11 days ago

    Jay good job 🎉

  • @fizzy9756
    @fizzy9756 13 days ago

    it is really helpful! thanks Jay.🙂

  • @christophergutknecht8683
    @christophergutknecht8683 15 days ago

    Thanks for this, amazing! Also love the debugging how not everything is correct on first try, that’s really helpful

  • @vaibs2312
    @vaibs2312 16 days ago

    I was struggling to simplify airflow and DBT integration and this tutorial really helped me get through the finish line. Thank you!

  • @digitalnaturediaries
    @digitalnaturediaries 16 days ago

    amazing tutorial

  • @tendamolesta
    @tendamolesta 16 days ago

    I find hard to understand how to use dagster in a more memory friendly way. For example, let's assume I have a bigquery result that has a large amount of rows. Those rows has to be mapped into something else then two assets should take the data and write it another tables. How those kind of operation should be done by using reusable assets or op? Let's say... Asset A does the query and return the iterator (with result() ) An op B takes element by element and transform it and takes it back to another asset that doesn't expect the full data but it works with chunks and write them to db. I couldn't find anything that uses a more stream like approach. Should we pass objects like spark dataframe or it can be done in a easier way with op and asset annotations ? I think dagster, generators and iterators topic between op and asset can be a great topic to discuss about.

  • @tbd4156
    @tbd4156 16 days ago

    💗

  • @jakubmorawski8
    @jakubmorawski8 23 days ago

    Is it still worthwhile to learn this when starting a career as a data engineer, especially now that AI is automating almost everything? I'm asking because I'm currently a programmer exploring alternative career paths.

  • @KhanhLe-yv2gg
    @KhanhLe-yv2gg 23 days ago

    Your guideline is a gem. But the airflow part is not very clear, i deep dive so many times to fix hahaha

  • @benlahcensoufiane1589
    @benlahcensoufiane1589 24 days ago

    Thank you for this content

  • @santoshkumarchirra5895
    @santoshkumarchirra5895 29 days ago

    Hi @jayzern, thanks for video. Is the airflow running singular tests as well? Where did we mentioned "dbt test" in the airflow ?

  • @mito_dj
    @mito_dj 1 month ago

    Hey what setup are you using? Keyboard, etc?

  • @mariopazurbieta7717
    @mariopazurbieta7717 1 month ago

    great video!

  • @jermpoz2971
    @jermpoz2971 1 month ago

    THIS IS WHAT IDONT LIKE ABOUT I.T BLOGGERS THEY TALKING TOO MUCH OUTSIDE THE TOPIC....IT GETS BORING.

  • @Neeraj-NN
    @Neeraj-NN 1 month ago

    amazing vedio very clear to explain How snowflake,dbt,airflow and cosmos are all linked together to provide data transformation and the orchestration.

  • @Squeed79
    @Squeed79 1 month ago

    why everybody are listing completely different things with just a comma: apples, cars, planets, pianos....

  • @shr1939
    @shr1939 1 month ago

    Is it equally useful for Non-IT background candidates.

  • @bhavyashah98
    @bhavyashah98 1 month ago

    damn dude, you write queries like pro.... great to learn from you.

  • @soleboxy
    @soleboxy 1 month ago

    dude that dlthub is so cool

  • @wilcity
    @wilcity 1 month ago

    Great video! What text editor are you using?

  • @kareemhameed4042
    @kareemhameed4042 1 month ago

    I am 32 and i am transitioning into Data engineering. I have a basic understanding of Python but i have never touched SqL before. How feasible are my chances to succeed in this?

  • @fluxx23
    @fluxx23 1 month ago

    Their documentation seems good but this is the kind of thing that helps me learn! Huge thank you for putting this together!

  • @CWNC
    @CWNC 1 month ago

    Thank you! This was exactly what I needed.

  • @EddieVanWilder
    @EddieVanWilder 1 month ago

    This video has the exact answer to my questions as I'm diving into data modeling for analytics. I'm sure everyone doing this for their first time that they will find this video super helpful. Would be cool to see dbt with Cosmos for smoother operation 👌 EDIT: I was literally just getting into the Deployment part of the video, and there you introduce using Cosmos for Airflow. Kudos!!

  • @NtwaliAlain-j1p
    @NtwaliAlain-j1p 1 month ago

    Hi what the support may help the beginner by owner course by self should be possible on data engineer

  • @bazi15
    @bazi15 1 month ago

    make video related star and dimension modeling

  • @bazi15
    @bazi15 1 month ago

    make video related star and dimensional modeling

  • @shloktalhar3981
    @shloktalhar3981 1 month ago

    Tell me one thing , is data engineering good job profile for freshers

  • @meghasingal7082
    @meghasingal7082 1 month ago

    Very well explained EMR video, thank you

  • @ShreyasSureshDhamore
    @ShreyasSureshDhamore 1 month ago

    Hi I am trying your proect and got stuk here can you here 21:32:24 Unable to do partial parsing because saved manifest not found. Starting full parse. 21:32:25 Encountered an error: Compilation Error Model 'model.DATA_PIPELINE.stg_tpch_orders' (models/staging/stg_tpch_orders.sql) depends on a source named 'tpch.orders' which was not found

  • @ChrisUK70
    @ChrisUK70 1 month ago

    When ETL came about the Cloud did not exist, I was writing shell scripts and SQL almost 30 years ago to do ETL. Useful video thanks!

  • @bazi15
    @bazi15 2 months ago

    i need to learn more from u keep posting

  • @aritra1414
    @aritra1414 2 months ago

    Concise and to the point. It was very helpful. Thanks, please show more end to end complex projects like this

  • @dogenature4801
    @dogenature4801 2 months ago

    Hi! really enjoy your tutorial, would like to see a tutorial how to create data CI/CD pipeline starting from pulling latest branch, running data test on staging, and deploy changes to production after test is complete since not lot of youtuber explaining this

    • @jayzern
      @jayzern 2 months ago

      This is actually a brilliant idea, thanks for the rec!

  • @rileylee2866
    @rileylee2866 2 months ago

    very good session, helped me get a much more concrete idea about how those tools look like and how they work together

  • @hasinirajapaksha333
    @hasinirajapaksha333 2 months ago

    I don't have degree in data science but I am pursuing degree in software system engineering. Can i get a job

  • @moverecursus1337
    @moverecursus1337 2 months ago

    Great VIdeo

  • @Damian-cd2tj
    @Damian-cd2tj 2 months ago

    Dude, what do you mean if you could start over? You’re just starting haha

  • @goabc-u2o
    @goabc-u2o 2 months ago

    (venv) PS C:\Users\hsrak\Desktop\DataManagemet2 ew\data_pipeline\dbt-dag> brew install astro brew : The term 'brew' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again. At line:1 char:1 + brew install astro + ~~~~ + CategoryInfo : ObjectNotFound: (brew:String) [], CommandNotFoundException + FullyQualifiedErrorId : CommandNotFoundException Please help me

  • @Lhtokbgkmvfknv
    @Lhtokbgkmvfknv 2 months ago

    It's beautiful! Thx man!

  • @manifestingthroughmeditati718
    @manifestingthroughmeditati718 2 months ago

    I have done beside knowledge in Python and more but I really want to move in to data engineering. Please can you help

  • @TheDataArchitect
    @TheDataArchitect 2 months ago

    That was FAST, you are subscribed :D Any vids related to "Amazon Managed Workflows for Apache Airflow"???

  • @RyanSpurr-k5v
    @RyanSpurr-k5v 2 months ago

    Good vid but move your facecam out of the terminal

  • @hasnaouiwafae6031
    @hasnaouiwafae6031 2 months ago

    I cannot run my dbt project. I’m still a beginner but I do not understand why this happens, considering that my macros directory is empty except for a .gitkeep file: Compilation Error dbt found two macros named "materialization_table_default" in the project "dbt". To fix this error, rename or remove one of the following macros: - macros/materializations/models/table/table.sql - macros/materializations/models/table.sql

  • @nellyoi9831
    @nellyoi9831 2 months ago

    thank you, this is great tutorial

  • @SakshiGowda-vl1ke
    @SakshiGowda-vl1ke 2 months ago

    hi ! I'm having trouble connecting to snowflake. can someone please help me resolve it . I just started learning dbt and snowflake . Runtime Error Database error while listing schemas in database "dbt_db" Database Error 250001: Could not connect to Snowflake backend after 2 attempt(s).Aborting

    • @krishuynh3337
      @krishuynh3337 14 days ago

      worth checking your snowflake credentials again, I got the same error due to an incorrect account id

  • @KheireddineAzzez-l3g
    @KheireddineAzzez-l3g 2 months ago

    nice, keep going