Delta Live Tables: Building Reliable ETL Pipelines with Azure Databricks

  • Published: 31 Jan 2025

Comments • 56

  • @supriyasharma9517 · 5 months ago · +1

    Great video and easy explanation. I hope you come up with a step-by-step series on Databricks for beginners like me who are finding it difficult / struggling to make the switch. Thanks for your efforts!

  • @samanthamccarthy9765 · 1 year ago · +1

    Awesome, thanks so much. This is really useful for me as a Data Architect; much is expected from us with all the varying technology.

  • @amadoumaliki · 11 months ago · +2

    As usual: wonderful, Mahit!

  • @debashisrath2861 · 9 days ago

    Good explanation.

  • @Databricks · 1 year ago · +3

    Nice video🤩

  • @menezesnatalia · 1 year ago · +2

    Nice tutorial. Thanks for sharing. 👍

  • @ananyanayak7509 · 1 year ago · +2

    Well explained with so much clarity. Thanks 😊

    • @SQLBits · 1 year ago

      Our pleasure 😊

    • @ADFTrainer · 1 year ago · +1

      @SQLBits Can you provide the code? Thanks in advance.

  • @MichaelEFerry · 1 year ago · +2

    Great presentation.

    • @SQLBits · 1 year ago

      Thanks for watching :)

  • @PravinUser · 2 months ago

    Absolutely nailed it!!!

  • @Rangapetluri · 7 months ago

    Wonderful session, sensible questions asked. Cool!

  • @starmscloud · 1 year ago · +1

    Learned a lot from this. Thank you for this video!

    • @SQLBits · 1 year ago

      Glad it was helpful!

  • @priyankpant2262 · 10 months ago · +1

    Great video! Can you share the GitHub location of the files used?

  • @pankajjagdale2005 · 1 year ago · +2

    Crystal clear explanation, thank you so much. Can you provide that notebook?

  • @SAURABHKUMAR-uk5gg · 5 months ago · +2

    @30:03, if you're defining the schema while creating the table, then why select the inferSchema = True option again?

    • @Rafian1924 · 2 months ago

      Good observation. I also had this doubt.
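
A note on the schema question above: once an explicit schema is supplied, inference is redundant and only costs an extra scan of the source files. The sketch below is illustrative (path and columns are made up, not from the speaker's notebook); the same logic applies to the map("inferSchema", "true") option passed to cloud_files in DLT SQL.

```python
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

# Illustrative schema; `spark` is the session Databricks provides in notebooks.
schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), True),
])

# Redundant: the explicit schema takes precedence, so asking Spark to infer
# it again only adds an extra pass over the source files.
# df = spark.read.schema(schema).option("inferSchema", True).csv("/mnt/raw/people")

# Sufficient on its own:
df = spark.read.schema(schema).csv("/mnt/raw/people")
```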

  • @germanareta7267 · 1 year ago · +3

    Great video, thanks.

  • @Rafian1924 · 2 months ago

    Awesome 🙏😍

  • @anantababa · 11 months ago · +1

    Awesome training. Can you please share the data file? I want to try it.

  • @prashanthmally5765 · 9 months ago

    Thanks SQLBits. Question: can we create a "View" on the Gold layer instead of having a "Live Table"?
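
For what it's worth, DLT supports both; a minimal sketch, with table and column names that are illustrative rather than from the video. The trade-off: a live table is materialized and queryable outside the pipeline, while a DLT view is recomputed within the pipeline and not published to the metastore, so BI tools could not query a gold view directly.

```python
import dlt
from pyspark.sql import functions as F

# Materialized gold table: persisted and queryable outside the pipeline.
@dlt.table(comment="Daily sales totals (materialized).")
def gold_daily_sales():
    return (
        dlt.read("silver_sales")
        .groupBy("sale_date")
        .agg(F.sum("amount").alias("total_amount"))
    )

# DLT view: same logic, but scoped to the pipeline and not published,
# so it suits intermediate steps rather than a consumer-facing gold layer.
@dlt.view
def gold_daily_sales_vw():
    return (
        dlt.read("silver_sales")
        .groupBy("sale_date")
        .agg(F.sum("amount").alias("total_amount"))
    )
```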

  • @srinubathina7191 · 1 year ago · +1

    Wow, super stuff. Thank you, sir!

    • @SQLBits · 1 year ago

      Glad you liked it!

  • @trgalan6685 · 1 year ago · +1

    Great presentation. No example code. What's zero times zero?

  • @MohitSharma-vt8li · 1 year ago · +2

    Can you please provide us the notebook DBC file or ipynb?
    By the way, great session.
    Thanks!

    • @SQLBits · 1 year ago

      Hi Mohit, you can find all resources shared by the speaker here: events.sqlbits.com/2023/agenda
      Just find the session you're looking for; if the speakers have supplied us with their notes etc., you will see them there once you click on it!

    • @MohitSharma-vt8li · 1 year ago

      @SQLBits Thanks so much!

  • @artus198 · 1 year ago · +6

    I sometimes feel the good old ETL tools like SSIS and Informatica were easier to deal with! 😄
    (I am a seasoned on-premises SQL developer, slowly transitioning into the Azure world.)

    • @SAURABHKUMAR-uk5gg · 5 months ago

      That was only good if you are working on a legacy, so-called monolithic architecture. With the amount of data generated growing each day, we need SaaS platforms like Databricks and Snowflake to perform all the data activities.

    • @artus198 · 5 months ago

      @SAURABHKUMAR-uk5gg The whole idea of a service principal ID and key, stored in a vault, was total rubbish architecture; now they are slowly moving towards Managed Identity... Azure is totally not worth it!

  • @Chandurkar-i2y · 1 year ago

    How can we leverage it for complex rule-based transformations?
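
One common pattern for rule-based work in DLT is to collect the rules in a dictionary and apply them in bulk as expectations. A hedged sketch, with made-up table and rule names:

```python
import dlt

# Hypothetical rule set: each value is a SQL boolean expression.
rules = {
    "positive_amount": "amount > 0",
    "known_country": "country IN ('US', 'GB', 'DE')",
    "has_timestamp": "event_ts IS NOT NULL",
}

@dlt.table
@dlt.expect_all_or_drop(rules)  # drop any row violating a rule; per-rule metrics are recorded
def silver_orders():
    return dlt.read("bronze_orders")
```

The transformations themselves are ordinary DataFrame code, so arbitrarily complex rules can also live in normal PySpark functions that the table definition calls.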

  • @walter_ullon · 9 months ago

    Great stuff, thank you!

  • @benim1917 · 5 months ago

    Excellent

  • @ashwenkumar · 11 months ago

    Do Delta Live Tables in all the layers have a filesystem location linked to them, as in Hive or Databricks?
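
As far as I understand, yes: every DLT table is a Delta table whose files live under the pipeline's configured storage location, and for pipelines publishing to the Hive metastore you can pin a table to an explicit path. A sketch with an illustrative ABFSS path:

```python
import dlt

# `path` pins the backing Delta files explicitly (Hive-metastore pipelines);
# without it, the files land under the pipeline's storage location.
@dlt.table(path="abfss://lake@myaccount.dfs.core.windows.net/silver/customers")
def silver_customers():
    return dlt.read("bronze_customers")
```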

  • @saikeerthanakattige617 · 5 months ago

    How do I initially get started with Databricks: creating clusters, data, notebooks, and setting up the infrastructure? I am not able to move forward because of that! Please help.

    • @Vishnu-Kanth-01 · 5 months ago

      I guess this helps:
      ruclips.net/video/EyJgykIcy_I/видео.html

  • @olegkazanskyi9752 · 1 year ago

    Is there a video on how data is pulled from the original source, like a remote SQL/NoSQL server or some API?
    I wonder how the data gets to the data lake.
    I assume this first extraction should be the bronze layer.
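
Extraction from a remote SQL/NoSQL server or an API is usually a separate job (for example an Azure Data Factory copy activity, or a JDBC batch read) that lands raw files in the lake; the bronze DLT table then picks those files up incrementally with Auto Loader. A minimal sketch, with illustrative paths:

```python
import dlt

# Bronze: ingest whatever the upstream extract landed in the container.
@dlt.table(comment="Raw landed files, ingested incrementally with Auto Loader.")
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("abfss://landing@myaccount.dfs.core.windows.net/events/")
    )
```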

  • @Chandurkar-i2y · 1 year ago

    Is there any way to load new files sequentially if a bunch of files arrive at the same time?
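
Auto Loader already queues newly discovered files in its checkpoint and works through them incrementally; if "sequentially" means limiting how much each micro-batch takes on, the cloudFiles.maxFilesPerTrigger option caps it. A sketch (path and cap are illustrative):

```python
# Cap each micro-batch at 10 newly arrived files; the rest stay queued
# in the checkpoint and are picked up by subsequent batches.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("cloudFiles.maxFilesPerTrigger", 10)
    .load("/mnt/landing/orders/")
)
```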

  • @M0RZ3N · 3 months ago

    How is this different from using DataFrames in PySpark?
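
The transformations are ordinary DataFrame code either way; the difference is who owns the orchestration. A rough side-by-side, with illustrative paths and names:

```python
import dlt

# Plain PySpark: you schedule the jobs, order the steps, and manage the
# writes, checkpoints, and retries yourself.
df = spark.read.format("delta").load("/mnt/silver/sales")
(df.groupBy("region").count()
   .write.format("delta").mode("overwrite").save("/mnt/gold/sales_by_region"))

# DLT: the same transformation declared as a dataset. The pipeline infers the
# dependency graph from dlt.read(), handles the writes and retries, and
# tracks data-quality expectations for you.
@dlt.table
def gold_sales_by_region():
    return dlt.read("silver_sales").groupBy("region").count()
```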

  • @lostfrequency89 · 10 months ago

    Can we create a dependency between two notebooks?
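
Within a single DLT pipeline, yes: a pipeline can span several notebooks, and referencing a dataset by name creates the dependency no matter which notebook defines it (outside DLT, task dependencies in Databricks Workflows play the same role). An illustrative sketch:

```python
# Notebook A (attached to the pipeline) defines:
import dlt

@dlt.table
def silver_customers():
    return dlt.read("bronze_customers")

# Notebook B, attached to the same pipeline, just references the name;
# DLT wires the cross-notebook dependency automatically.
@dlt.table
def gold_active_customers():
    return dlt.read("silver_customers").where("is_active = true")
```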

  • @thinkbeyond18 · 1 year ago

    I have a general doubt about Auto Loader: does it need to run in a job or a notebook triggered manually? Or, once the code is written, is there no need to touch anything, so that when a file arrives it will run automatically and process the files?

    • @Databricks · 1 year ago · +1

      Trigger the notebook that contains your DLT + Auto Loader code with Databricks Workflows. You can trigger it on a schedule, on file arrival, or run the job continuously. It doesn't matter how you trigger the job: Auto Loader will only process each file once.
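
The once-per-file guarantee comes from the stream's checkpoint, which records every ingested file. The same behaviour outside DLT, as a sketch with illustrative paths; trigger(availableNow=True) drains everything new and then stops, so a scheduled job never reprocesses a file:

```python
(
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/chk/events_schema")
    .load("/mnt/landing/events/")
    .writeStream
    .option("checkpointLocation", "/mnt/chk/events")  # remembers processed files
    .trigger(availableNow=True)  # consume all new files, then shut down
    .toTable("bronze_events")
)
```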

  • @guddu11000 · 11 months ago

    Should have shown us how to troubleshoot or debug.

  • @supriyasharma9517 · 5 months ago

    Can you please provide the code for this?

  • @TheDataArchitect · 1 year ago

    I don't get the usage of VIEWS between Bronze and Silver tables.

    • @TheDataArchitect · 1 year ago

      Anyone?

    • @SQLBits · 1 year ago

      Hi Shzyincu, you can get in touch with the speakers who presented this video via LinkedIn and Twitter if you have any questions!

    • @richardslaughter4245 · 10 months ago · +1

      My understanding (as an "also figuring out Databricks" newb):
      * View: because the difference between bronze and silver here is very small (no granularity changes, no joins, no heavy calculations, just one validation constraint), it doesn't really make sense to make another copy of the table when a view would be just as performant in this case.
      * "Live" view: I think this is required because the pipeline needs it to be a live view to properly calculate pipeline dependencies.
      Hopefully that understanding is correct, or others will correct me :)
      My follow-up question: that validation constraint seems functionally identical to just applying a filter in the view. Is that correct? If so, is the reason to use the validation constraint rather than a filter mostly to keep code consistency between live tables and live views? (See the sketch after this thread.)

    • @anilkumarm2943 · 6 months ago

      You don't materialize new tables every time; sometimes we materialize them as views, for minor transformations like changing the type of a field, etc.
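
On the expectation-vs-filter point in this thread: they do keep the same rows, but an expectation also reports pass/drop counts to the pipeline's event log, which a bare filter never would. A small sketch, with illustrative names:

```python
import dlt

# Expectation: invalid rows are dropped AND counted in the event log.
@dlt.view
@dlt.expect_or_drop("valid_id", "customer_id IS NOT NULL")
def silver_customers_vw():
    return dlt.read("bronze_customers")

# Filter: same surviving rows, but invisible to data-quality monitoring.
@dlt.view
def silver_customers_filtered():
    return dlt.read("bronze_customers").where("customer_id IS NOT NULL")
```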

  • @tratkotratkov126 · 10 months ago

    Hm... Where in these pipelines have you specified the nature of the created/maintained entity (bronze, silver, or gold), other than in the name of the object itself? Also, where exactly are these LIVE tables stored? From your demonstration they appear to all live in the same schema/database, while in real life the bronze, silver, and gold entities have designated catalogs and schemas.
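
As far as I know, the medallion layers are purely a naming convention in DLT: nothing in the decorators marks a table bronze or silver, and where tables are published is pipeline-level configuration. A sketch of the relevant settings, shown as a Python dict for illustration (the keys exist in DLT pipeline settings; the values are made up):

```python
# One pipeline publishes to one catalog/schema, so teams wanting designated
# bronze/silver/gold schemas typically run one pipeline per layer (or per
# domain) and point each at its own target. With the Hive metastore you
# would set a "storage" path instead of "catalog".
pipeline_settings = {
    "catalog": "prod_lakehouse",  # Unity Catalog catalog to publish to
    "target": "silver",           # schema (database) for the pipeline's tables
}
```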

  • @ADFTrainer · 1 year ago · +1

    Please provide code links.

  • @Ptelearn4free · 9 months ago

    Databricks has a pathetic UI...

  • @freetrainingvideos · 6 months ago

    Very well explained, thanks!