Ask Databricks - about Delta Live Tables (DLT) with Michael Armbrust

  • Published: Nov 24, 2024

Comments • 9

  • @shikokas
    @shikokas 1 year ago +7

    Some of my key takeaways:
    1. 7:38 - “we want to eventually make it possible to run DLT locally outside of Databricks” - very important and good to know!
    2. 15:50 - “we are not the only project kind of in this space ... things like DBT” - there is an ongoing question about whether DLT and DBT are the same and what the differences are - see the answer at 40:29
    3. 17:12 - DLT has better autoscaling than DBX workflows - too bad they don’t bring those capabilities into workflows so we could enjoy them there too
    4. 20:07 - “now we can use Unity Catalog with DLT” - well, it isn’t that simple - the list of limitations is very long, and managed locations/tables aren’t supported yet
    5. 22:20 - the medallion model is a cool idea but totally made up ... it should be used as a very useful vocabulary for data quality - so don’t obsess about it 🙂
    6. 26:09 - the Enzyme engine is used for incremental materialized view work - it doesn’t handle all types of queries, so take that into account; it should be supported in the long term, but there are no due dates
    7. 34:12 - always prefer expressions over UDFs
    8. DLT Serverless will have a lot of capabilities - note that serverless isn’t supported on all cloud vendors yet
    9. To this day there isn’t an option to debug DLT (as you would in a workflow notebook) without running the pipeline - per the answers in the chat, this should come in the future

  • @svenerikhaberg4146
    @svenerikhaberg4146 1 year ago +2

    Really great initiative, giving insight into both the product and vision of the developers. Thank you so much for sharing and putting time and effort into publishing all your great videos 👌🙏

  • @lukehoughton
    @lukehoughton 1 year ago

    Great format. Lots of new and interesting things to think about. Looking forward to the next one.

  • @GerardWolfaardt
    @GerardWolfaardt 1 year ago +2

    Really excited about apply changes from snapshot! Is there a timeline for this feature? I know, I know, but I had to ask!

    • @MartinIsti
      @MartinIsti 1 year ago

      I can support this, it would be so great to have it available soon at least in some kind of preview!
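
The idea behind applying changes from a snapshot can be illustrated in plain Python: compare two full snapshots of a source keyed by primary key and derive insert, update, and delete events. This is only a minimal sketch of the concept, not the DLT API; `diff_snapshots`, the `Snapshot` type, and the sample data are hypothetical names made up for illustration.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
Snapshot = Dict[str, Row]  # primary key -> row attributes (hypothetical shape)


def diff_snapshots(previous: Snapshot, current: Snapshot) -> List[Tuple[str, str, Row]]:
    """Return (operation, key, row) events that turn `previous` into `current`."""
    events: List[Tuple[str, str, Row]] = []
    # Keys that only exist in the new snapshot are inserts.
    for key in sorted(current.keys() - previous.keys()):
        events.append(("insert", key, current[key]))
    # Keys present in both snapshots are updates only if the row changed.
    for key in sorted(current.keys() & previous.keys()):
        if current[key] != previous[key]:
            events.append(("update", key, current[key]))
    # Keys that vanished from the new snapshot are hard deletes.
    for key in sorted(previous.keys() - current.keys()):
        events.append(("delete", key, previous[key]))
    return events


# Made-up sample snapshots for two consecutive days.
day1 = {"a": {"city": "Oslo"}, "b": {"city": "Bergen"}}
day2 = {"a": {"city": "Trondheim"}, "c": {"city": "Molde"}}
changes = diff_snapshots(day1, day2)
for change in changes:
    print(change)
```

Note that a row missing from the newer snapshot is the only signal of a hard delete here, which is why snapshot comparison can surface deletes that a change feed without delete events cannot.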

  • @JD-xd3xp
    @JD-xd3xp 1 year ago

    I want to see what my architecture diagram will look like when I put Streaming, Materialized Views, Serverless, DLT, etc. together for ingesting structured and semi-structured data.

  • @MartinIsti
    @MartinIsti 1 year ago

    It was mentioned at 14:10 that SCD2 sounds conceptually simple to start with but can get so complicated that all examples around streaming SCD2 needed to be corrected in the docs. I wonder if there is a good conceptual guide about all the possible SCD2 scenarios (with sample source and expected outcome records). Besides the usual update / insert / delete there is "deleted row reinstated" and other interesting ones.
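
As a rough illustration of such scenarios, here is a small plain-Python sketch of SCD2 bookkeeping with sample source events and the resulting history, including the "deleted row reinstated" case. It is not how DLT implements SCD2; `apply_scd2`, the record layout, and the sample events are all hypothetical, and it assumes events arrive in timestamp order (out-of-order events, one of the harder cases, are not handled).

```python
from typing import Any, Dict, List, Optional

Version = Dict[str, Any]


def apply_scd2(history: List[Version], op: str, key: str,
               row: Optional[Dict[str, Any]], ts: int) -> None:
    """Apply one change event in place; assumes events arrive ordered by ts."""
    # The open version for a key is the one whose end timestamp is still None.
    open_version = next(
        (v for v in history if v["key"] == key and v["end"] is None), None
    )
    if op == "delete":
        # A delete only closes the open version; nothing new is appended.
        if open_version is not None:
            open_version["end"] = ts
        return
    if open_version is not None:
        if open_version["row"] == row:
            return  # identical row: no new version needed
        open_version["end"] = ts  # close the old version before opening a new one
    # Insert, update, and reinstatement all open a fresh version.
    history.append({"key": key, "row": row, "start": ts, "end": None})


# Made-up sample events: insert, update, delete, then the same key reinstated.
history: List[Version] = []
apply_scd2(history, "insert", "k1", {"status": "new"}, ts=1)
apply_scd2(history, "update", "k1", {"status": "active"}, ts=2)
apply_scd2(history, "delete", "k1", None, ts=3)
apply_scd2(history, "insert", "k1", {"status": "active"}, ts=5)

for version in history:
    print(version["key"], version["row"], version["start"], version["end"])
```

In this sketch the reinstated row gets a new version starting at 5, leaving a gap between 3 and 5 during which the key has no current record. Whether that gap is the desired outcome is exactly the kind of decision a conceptual guide would need to spell out.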

  • @radosawbrygaa7420
    @radosawbrygaa7420 1 year ago

    Great video, great series. Please mute yourself next time while your guest is speaking, because the echo is super annoying - no problem ;)

  • @Balquo_
    @Balquo_ 1 year ago

    I'm using SCD2 apply changes. I would like the end date to be updated when there is a hard delete in the raw data. Outside of DLT I could use WHEN NOT MATCHED BY SOURCE, but within DLT it doesn't seem possible. Is it?