What Is DBT and Why Is It So Popular - Intro To Data Infrastructure Part 3

  • Published: Apr 14, 2022
  • What Is dbt?
    Built by Fishtown Analytics (now dbt Labs), the data build tool, or dbt, allows data analysts, data engineers and analytics engineers to execute the "transform" step in the Extract-Load-Transform pipeline.
    They do this by writing transforms in SQL and executing them against their company's database.
    Dubbed the "analytics engineering tool," dbt is also open source and has become a regular part of the modern data stack at many companies. A solid support community has also formed around it.
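To make the "transform" step concrete, here is a minimal sketch in Python. It is not dbt's API: dbt templates SQL with Jinja and runs it through warehouse adapters, while this stand-in uses only the standard library (`string.Template` and an in-memory SQLite database). The table names, the `build_model` helper, and the sample rows are all hypothetical.

```python
import sqlite3
from string import Template

# Hypothetical "model": a SELECT statement with one templated parameter.
# dbt itself uses Jinja; string.Template stands in to keep this stdlib-only.
MODEL_SQL = Template("""
    SELECT customer_id, SUM(amount) AS total_spent
    FROM raw_orders
    WHERE status = '$status'
    GROUP BY customer_id
""")


def build_model(conn, name, select_sql):
    """Materialize a SELECT as a table, replacing any previous version."""
    conn.execute(f"DROP TABLE IF EXISTS {name}")
    conn.execute(f"CREATE TABLE {name} AS {select_sql}")


conn = sqlite3.connect(":memory:")

# Pretend the extract and load steps already landed raw data in the database.
conn.execute("CREATE TABLE raw_orders (customer_id INTEGER, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, 20.0, "paid"), (1, 5.0, "refunded"), (2, 12.5, "paid"), (1, 7.5, "paid")],
)

# The transform step: render the templated SQL and run it inside the database.
build_model(conn, "customer_totals", MODEL_SQL.substitute(status="paid"))

rows = conn.execute(
    "SELECT customer_id, total_spent FROM customer_totals ORDER BY customer_id"
).fetchall()
print(rows)
```

The point of the sketch is the division of labor: the analyst writes only the SELECT, and the tool wraps it in the statements that create the table inside the database.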
    Looking to start your own data engineering/analytics consulting company? Then you should check out my new course here
    courses.technicalfreelancerac... - and use the coupon code "deconsult" to get 50% off
    If you enjoyed this video, check out some of my other top videos.
    The Ultimate Guide To Starting A Data Consulting Company In 2024 | Data Consulting 101
    • The Ultimate Guide To ...
    What Is The Modern Data Stack - Intro To Data Infrastructure Part 1
    • What Is The Modern Dat...
    If you'd like to read up on my updates about the data field, then you can sign up for our newsletter here.
    seattledataguy.substack.com/
    Or check out my blog
    www.theseattledataguy.com/
    And if you want to support the channel, then you can become a paid member of my newsletter
    seattledataguy.substack.com/s...
    Tags: Data engineering projects, Data engineer project ideas, data project sources, data analytics project sources, data project portfolio
    _____________________________________________________________
    Subscribe: / @seattledataguy
    _____________________________________________________________
    About me:
    I have spent my career focused on all forms of data. I have focused on developing algorithms to detect fraud, reduce patient readmission and redesign insurance provider policy to help reduce the overall cost of healthcare. I have also helped develop analytics for marketing and IT operations in order to optimize limited resources such as employees and budget. I privately consult on data science and engineering problems both solo as well as with a company called Acheron Analytics. I have experience both working hands-on with technical problems as well as helping leadership teams develop strategies to maximize their data.
    *I do participate in affiliate programs. If a link has an "*" by it, then I may receive a small portion of the proceeds at no extra cost to you.
  • Entertainment

Comments • 100

  • @SeattleDataGuy
    @SeattleDataGuy  2 years ago +2

    If you guys want to learn more about data engineering, then sign up for my newsletter here seattledataguy.substack.com/ or join the discord here discord.gg/2yRJq7Eg3k

  • @michaelperry5442
    @michaelperry5442 1 year ago +4

    Holy goodness, thank you. I've been learning dbt as a Software Engineer and have been trying to figure out why it matters at all. This video helped a ton!

  • @yusufs8095
    @yusufs8095 2 years ago +16

    Great video as always Ben! Thank you for consistently providing us with value.

    • @SeattleDataGuy
      @SeattleDataGuy  2 years ago +1

      Thank you! Hopefully you are doing well.

  • @TheElectrocar
    @TheElectrocar 1 year ago +18

    DBT was a game changer for our Tableau workbooks. Our Tableau workbooks were sometimes taking up to 1-2 minutes to fully load. Once our data engineers added DBT to the mix, the Tableau workbooks were taking 3-5 seconds to load. I am not fully sure how all this was done by our data engineers, but it was straight-up voodoo magic and our end users loved it. A bonus is that it made us look good to our third-party partners, who said even their own data team's reports were taking minutes to fully load.

    • @bharathraj4545
      @bharathraj4545 1 year ago

      I am new to data engineering.
      I got stuck at the exact same point.
      Can you help me out and provide any information to contact your data engineering team members?

    • @BigQueyrie
      @BigQueyrie 11 months ago +4

      Probably due to the fact that most computations were done in the underlying data sources, and not using the Tableau engine itself. Same can be said about table relationships: aggregating the data to the target granularity helps with that.

    • @bharathraj4545
      @bharathraj4545 11 months ago

      @jean-baptistequeyrie3388 thanks 😊

    • @carldragseth6379
      @carldragseth6379 5 months ago

      This @BigQueyrie

  • @victorberishaj
    @victorberishaj 2 years ago +12

    Great video, and the Discord sound got me good at 0:57

    • @SeattleDataGuy
      @SeattleDataGuy  2 years ago +3

      I didn’t even notice that hahahah it was a legitimate discord sound

  • @KahanDataSolutions
    @KahanDataSolutions 2 years ago +10

    Loving your channel and this video. I had a similar start with data pipelines and to me dbt feels like somebody took the frustrations I had with those GUI ETL tools and packaged it up into a better, modern product. Writing code is more fun anyway (IMO). Keep up the great vids!

    • @SeattleDataGuy
      @SeattleDataGuy  2 years ago

      Love your channel! I have watched plenty of your dbt videos!

  • @vladimirdinolov852
    @vladimirdinolov852 2 years ago +4

    It's about time! Excited to watch this.

  • @amitesh-rai
    @amitesh-rai 2 years ago +19

    dbt is a great tool. I would really love to see a video from you comparing Prefect, Dagster, and Airflow and how they fit into modern data stack.

    • @SeattleDataGuy
      @SeattleDataGuy  2 years ago +6

      That would be another great video! I have been starting to dig into dagster and prefect a little more.

  • @avimehenwal
    @avimehenwal 1 year ago +1

    Very high quality content. Helped me clear up some long-running doubts I had. Thank you for sharing

  • @RockTrembath
    @RockTrembath 11 months ago +3

    Thanks! I have been sleeping under a rock for two years - appreciate you helping me get up to speed 🙂

  • @0megal2
    @0megal2 2 years ago +10

    We use airflow + DBT in most of our projects now, it's such a good tool

    • @SeattleDataGuy
      @SeattleDataGuy  2 years ago +3

      Awesome. Any tips for future comment readers?

    • @tattarrrrattat
      @tattarrrrattat 9 months ago

      @SeattleDataGuy I'm from the future...

    • @ravishmahajan9314
      @ravishmahajan9314 7 months ago

      So you may be using Airflow for orchestration and dbt for transformation.
      Can you please tell us about Airflow? We also have Prefect and Dagster. Do you think they can overtake Airflow in the future?

    • @0megal2
      @0megal2 7 months ago

      @ravishmahajan9314 I am not experienced enough with prefect to talk about it, so it depends. In our case we were using Cloud Composer, which is gcp's managed Airflow, so it made things easier because the client infra was already hosted at gcp

  • @andrelsjunior
    @andrelsjunior 2 years ago +6

    You always know what I want to watch. LOL
    startdataengineering is 🔥🔥

  • @lhxperimental
    @lhxperimental 9 months ago +23

    Still don't get what dbt is

    • @danilomenoli
      @danilomenoli 4 months ago +2

      It's Jinja with SQL lol

  • @anildangol
    @anildangol 2 years ago +1

    Great Video man!

  •  8 months ago

    I was wondering how I could break into the DE field on my own. Then I came across your newsletters, YouTube, and then the Discord channel. Now I don't feel alone. Amazing content. Thank you!

  • @GlowinginTech
    @GlowinginTech 1 year ago +1

    super helpful content, thank you!

  • @KUBKO17
    @KUBKO17 1 year ago

    Greetings from Vancouver. Thank you for this video on DBT. All other ones were 30min+ :)

  • @alexanderlin2022
    @alexanderlin2022 2 years ago +4

    If you want to actually know what DBT is, fast forward to 4:20. But if you already know a little bit about DBT and just want to find more advanced information, you can skip this whole video.

  • @alexanderpotts8425
    @alexanderpotts8425 2 years ago +5

    dbt was a breath of fresh air from ssis. I still think ssis was great for its time and still a usable tool in some places. sometimes I miss it... sometimes.

  • @Rex_793
    @Rex_793 2 years ago +1

    thanks for the video ben!

  • @datasqlai
    @datasqlai 1 year ago +2

    I can correlate with your journey. Started off with using DTS in SQL 2000 and using SSIS first time with SQL server 2005 :)

    • @ScottEdwards2000
      @ScottEdwards2000 1 year ago

      same here! always liked DTS better than SSIS anyway. ;-) that said, loving the focus on #SQL that #dbt brought.

  • @aprilleclair6527
    @aprilleclair6527 1 year ago +2

    Hi Ben! Your videos are always top-notch and very helpful! I'm wondering what "snp" (?) stands for in this case. dbt would be an awesome tool to learn!

    • @SeattleDataGuy
      @SeattleDataGuy  1 year ago +1

      Glad you are enjoying the videos. I believe I said SMB, or small and medium business. It's a classification of business sizes.

  • @xXHelsingGamingXx
    @xXHelsingGamingXx 1 year ago +3

    SQL and JINJA! 🤯 My Mind is blown!

  • @tarunacharya1337
    @tarunacharya1337 1 year ago

    Nicely Explained - thanks

  • @javiereduardochaconarevalo3854
    @javiereduardochaconarevalo3854 1 year ago +7

    I always struggle to remember that DBT is not only an acronym for Dialectical Behavior Therapy

  • @shivabasayyahiremath6802
    @shivabasayyahiremath6802 10 months ago

    @SeattleDataGuy, thanks for your contribution in helping many understand DBT concepts.
    I have implemented an incremental DBT load (merge) with PostgreSQL, and the target table is partitioned.
    The load is performing very well in lower environments, taking half the time it used to take before partitioning.
    Strangely, when I move the changes to production, it's taking double the time it used to before partitioning.
    Same data volume,
    similar DB configuration.
    The only difference is that the lower env is serverless and the prod PostgreSQL is not serverless.
    Any thoughts on this would be greatly appreciated.

  • @coding3438
    @coding3438 1 year ago +1

    How would dbt replace ADF when there's no E and L in dbt? Who's gonna extract and load data into the warehouse for dbt to be able to transform it?

  • @cmaan4life1
    @cmaan4life1 2 years ago +12

    literally just had an interview yesterday where they asked me if i have experience with dbt and i said idk what that is lol

    • @SeattleDataGuy
      @SeattleDataGuy  2 years ago +3

      Now you have a high level idea, hopefully the start data engineering project is helpful

    • @henshawefiom5883
      @henshawefiom5883 2 years ago

      @SeattleDataGuy Can I have your email for mentoring, please?

  • @roberbonox
    @roberbonox 10 months ago

    Dude, how are you? I have a question: what could I do if I have a stream on Snowflake that I want to "consume" in dbt without creating a physical table or view, using something like an ephemeral materialization instead, only to purge the stream and keep it from going stale? I created an ephemeral model that selects from the stream source, but that obviously only creates an ephemeral materialization and doesn't clear the data in the stream. Thoughts?

  • @Pegasus1311
    @Pegasus1311 1 year ago

    Thanks 🦌.

  • @peterg4130
    @peterg4130 1 year ago +2

    I'd say SQL is a pro. If you've worked in BI for any length of time, then you know some SQL, so adopting DBT will be very smooth.

  • @andreykholkin2737
    @andreykholkin2737 11 months ago +1

    Thanks for the explanation!

  • @karangupta_DE
    @karangupta_DE 2 years ago +1

    Hi Ben, do you think no code etl tools like Informatica cloud are way more popular because it's easier to find resources? Or is it just because iics is very mature in terms of implementation.

    • @SeattleDataGuy
      @SeattleDataGuy  2 years ago +2

      What solutions companies pick is a combination of sales, POCs and the head of data or other high level individual's beliefs on whether to go open source, low code/no code, or 100% custom. There will always be a mix of people. Generally all of the various buckets will find their homes.

  • @mahmoudfahmy123
    @mahmoudfahmy123 2 years ago +3

    Elite thumbnail

  • @FirstNameLastName-fv4eu
    @FirstNameLastName-fv4eu 5 days ago +1

    This guy is the best example of what happens when you spend 10 yrs of your professional life in "super-cheap-money-world": a smart kid with a very vague idea of the real world :)

    • @SeattleDataGuy
      @SeattleDataGuy  5 days ago

      You think I am smart? Shucks. What is the real world to you?

    • @FirstNameLastName-fv4eu
      @FirstNameLastName-fv4eu 5 days ago +1

      @SeattleDataGuy Explaining the same reasoning to a bank, where people don't evaluate a technology on "how much money" it has raised. Your generation is just spoiled or scammed by cheap money culture.

    • @SeattleDataGuy
      @SeattleDataGuy  4 days ago

      Who do you think is responsible for cheap money culture?

  • @thedatadoctor
    @thedatadoctor 5 months ago +1

    OMG Ben, I also learned from Wise Owl Tutorials!

  • @frozenintime
    @frozenintime 1 year ago

    So it is because it's a watered-down Ansible, to the point that someone would only need to know SQL?

  • @marcello4258
    @marcello4258 10 months ago

    You want to talk about dbt vs dagster? Airflow is kinda nuts using for ETL imo. Airflow isn’t designed for ETL at all - dagster is.

  • @joshi1q2w3e
    @joshi1q2w3e 2 years ago +13

    I’m a new Data Engineer and I’m scared to ask what is “Jinja” 😅

    • @SeattleDataGuy
      @SeattleDataGuy  2 years ago +2

      Here is a link pypi.org/project/Jinja2/

    • @WallStreetSilver777
      @WallStreetSilver777 1 year ago +2

      Templating tool so that you abstract away logic with parameters

    • @adityanjsg99
      @adityanjsg99 1 year ago +2

      Template language, used in php and Django

  • @MMMS75
    @MMMS75 4 months ago +1

    Best. Thumbnail. Ever. Lol

    • @SeattleDataGuy
      @SeattleDataGuy  4 months ago

      hahaha, i def. took this meme from somewhere

  • @TivoKenevil
    @TivoKenevil 2 years ago +4

    My question is, at this point are there TOO many tools?
    What do you see more of in the market: companies using Python frameworks (SQLAlchemy, Airflow, etc.) or no-code tools like DBT (not considering SQL)?
    Edit: I meant SSIS, Informatica, not DBT

    • @SeattleDataGuy
      @SeattleDataGuy  2 years ago +2

      I think we are seeing consolidation. I also think it depends on which size of company you are at. I would say Airflow seems to be used more in mid-level companies and SMBs; it gets harder to manage as you scale.
      The market always fluctuates in terms of tools. So there is an unavoidable contraction coming.

    • @ivani3237
      @ivani3237 1 year ago

      ETL tools are like JS frameworks... a new one comes out every week

  • @sharadov
    @sharadov 2 years ago +1

    Basically Terraform for Data folk.

  • @garp9433
    @garp9433 5 months ago +1

    if this vid was from late 2021 and early 2022 I would have agreed about vc funding, but recently, not so much.

    • @SeattleDataGuy
      @SeattleDataGuy  4 months ago

      Totally agree, looking back at the date, this video was April 2022

  • @thomashass1
    @thomashass1 11 months ago +1

    After watching this video, it is still not clear to me what exactly dbt is doing. Examples would have been great.

  • @garp9433
    @garp9433 5 months ago

    You need more sound proofing in front of you and above you. You can hear surrounding echo or more items in the background or foreground. There's too much echo

  • @tanjaa9092
    @tanjaa9092 7 months ago

    feedback for your blog - the text colour lacks contrast to the background - makes it very uncomfortable to read

  • @bassett_green
    @bassett_green 1 year ago +1

    Okay i downloaded dbt why is my code still awful

  • @xenofongrigoriadis7547
    @xenofongrigoriadis7547 10 months ago

    Is this yet another "platform independence" obsession driven solution?
    Why should a company, that has decided to go with the Microsoft stack or the Oracle stack or whatever, reject the possibility to use its product specific tools, that give you all the freedom to create your custom ETL processes, while profiting from all specific features of your stack??

  • @atari1040
    @atari1040 1 year ago

    Meltano... Such a disaster... Loved by millions....right..

  • @maganzo
    @maganzo 2 months ago

    4:23 What is dbt

    • @maganzo
      @maganzo 2 months ago

      but i still don't know what it is

  • @pritishukla6433
    @pritishukla6433 1 year ago

    Dbt la jazates ka nahi kulthe

  • @Overthought1
    @Overthought1 1 year ago

    *watches first 40 seconds* So let me get this straight: you think most people know what DBT is, but it has only recently been understood that VCs can overvalue startups?

  • @jakobullmann7586
    @jakobullmann7586 1 year ago +4

    Sorry, but I don’t search for technical topics to see a talking head.