Delta Live Tables A to Z: Best Practices for Modern Data Pipelines

  • Published: 20 Nov 2024

Comments • 33

  • @stevequan7306 1 year ago +18

    This is the Bible for DLT! Worth looping and studying! Well done 🙌

  • @mrliuquantong4943 1 year ago +8

    Excellent demo! Would you please provide the PDF file of this demo as well as the code for us to practise with? Looking forward to hearing from you.

  • @jonathanduran2921 1 year ago +23

    Ha, the CEO knowing where the raw data is stored.. almost died laughing there.

    • @hapslab 11 months ago

      #databricks is an ecosystem now. Helped by all its amazing creators. Proud to be associated since 2015❤

  • @DenCato 1 month ago

    Almost makes me try DLT in production use cases :)
    Which brings me to my point: Databricks should make it clearer when to use DLT and what the positives, negatives and showstoppers are over the lifecycle of a project.
    At the moment, hardly anyone I know dares to use it in production, except for rather small, specific use cases, and that is mostly due to a lack of trust in being able to grow, maintain and debug a DLT-based data product.

  • @周中海 1 year ago +6

    Where can I get the PPT and the demo code?

  • @georges7298 5 months ago

    Fantastic DLT and pipeline training! Well done! Is there a GitHub project with a complete version of the example code shown in this video?

  • @jasonkhaihoang781 15 days ago

    Can we use CDF (Change Data Feed) for Apply Changes Into in Streaming tables? :)

  • @smedegaardpedersen 1 year ago +2

    Super good stuff.
    I wonder if the function call inside the loop @1:13:22 should have been `create_report(r)` instead of `create_table(r)`?

  • @mateen161 1 year ago +1

    Would it be possible to create unmanaged tables with a location in a data lake using DLT pipelines?

  • @henryeleonu6237 1 year ago +1

    Interesting! I now have an idea of what Delta Live Tables can do.

  • @周中海 1 year ago +2

    Question here: why do I get the following error when I run the same thing? "16:08:48 Running with dbt=1.6.2
    16:08:49 Registered adapter: databricks=1.6.4
    16:08:49 Unable to do partial parsing because saved manifest not found. Starting full parse.
    16:08:51 Found 2 models, 0 sources, 0 exposures, 0 metrics, 471 macros, 0 groups, 0 semantic models
    16:08:51
    16:14:02 Concurrency: 8 threads (target='databricks_cluster')
    16:14:02
    16:14:02 1 of 2 START sql streaming_table model default.device .......................... [RUN]
    16:14:03 1 of 2 OK created sql streaming_table model default.device ..................... [OK in 0.53s]
    16:14:03 2 of 2 START sql materialized_view model default.device_activity ............... [RUN]
    16:14:04 2 of 2 ERROR creating sql materialized_view model default.device_activity ...... [ERROR in 0.82s]
    16:14:04
    16:14:04 Finished running 1 streaming_table model, 1 materialized_view model in 0 hours 5 minutes and 12.60 seconds (312.60s).
    16:14:04
    16:14:04 Completed with 1 error and 0 warnings:
    16:14:04
    16:14:04 Runtime Error in model device_activity (models/example/device_activity.sql)
    [TABLE_OR_VIEW_NOT_FOUND] The table or view `main`.`default`.`device` cannot be found. Verify the spelling and correctness of the schema and catalog.
    If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog."
    From my understanding, the table can only be created by a DLT pipeline; dbt cannot create the table. But you succeeded in creating the streaming table and the MV. May I know why?

  • @irfana398 1 year ago +3

    Why can't we run the code in the cell for debugging? I have found that DLT has so many limitations and is hard to debug.

    • @alirezahassani3767 1 year ago

      I had been eagerly anticipating the release of this feature this year. Hopefully, they will add it soon.

    • @michaelarmbrust2076 1 year ago

      We are working on a debugging experience that will be integrated with notebooks.

  • @web3tel 1 year ago

    I am not sure I understood the repeated references to the "errors in our docs". Can you please clarify? What would be the reason to publish docs with errors? Is there quality control over these docs?

  • @Rothbardo 1 year ago +3

    anyone have a link to the slides?

  • @TheDataArchitect 1 year ago +1

    43:10 this is awesome man.

  • @SandeepGunda-u5q 1 year ago

    @michaelarmbrust2076 While using apply_changes, how do we handle duplicates in the sequence-by column in a stateless way? Does dropDuplicates deduplicate data per micro-batch, like a foreachBatch would? Or would it attempt to deduplicate the whole stream unless a watermark is given?

  • @samred-x8s 8 months ago

    Quick question: if a record is hard-deleted from the source table, how will apply_changes (CDC) handle it?

    • @jasonkhaihoang781 15 days ago

      If the source table is a Delta table and has Change Data Feed (CDF) enabled, I think you can propagate the "D" records from the CDF downstream via apply_changes (CDC).
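
To make the idea in this reply concrete, here is a minimal, in-memory sketch of how delete records from a change feed could be applied to a downstream table. It is plain Python that only models the upsert/delete semantics; it is not the DLT `apply_changes` API. The `_change_type` values (`insert`, `update_preimage`, `update_postimage`, `delete`) and the `_commit_version` column are the metadata Delta's Change Data Feed attaches to change rows; the function name `apply_cdf_changes`, the `id` key column and the sample rows are hypothetical.

```python
from typing import Any, Dict, Iterable, Mapping

# Metadata columns that Delta's Change Data Feed adds to every change row.
CDF_METADATA = ("_change_type", "_commit_version", "_commit_timestamp")


def apply_cdf_changes(
    target: Dict[Any, Dict[str, Any]],
    changes: Iterable[Mapping[str, Any]],
    key: str = "id",  # hypothetical primary-key column
) -> Dict[Any, Dict[str, Any]]:
    """Toy model of CDC upsert/delete semantics (not the DLT API)."""
    # Process changes in commit order, as a sequencing column would.
    for row in sorted(changes, key=lambda r: r["_commit_version"]):
        change_type = row["_change_type"]
        if change_type == "update_preimage":
            continue  # the old image of an update carries no new state
        if change_type == "delete":
            target.pop(row[key], None)  # a hard delete removes the key downstream
        else:  # "insert" or "update_postimage"
            target[row[key]] = {
                col: val for col, val in row.items() if col not in CDF_METADATA
            }
    return target


# Hypothetical change rows: two inserts, one update, one hard delete.
changes = [
    {"id": 1, "name": "sensor-a", "_change_type": "insert", "_commit_version": 1},
    {"id": 2, "name": "sensor-b", "_change_type": "insert", "_commit_version": 1},
    {"id": 1, "name": "sensor-a", "_change_type": "update_preimage", "_commit_version": 2},
    {"id": 1, "name": "sensor-a2", "_change_type": "update_postimage", "_commit_version": 2},
    {"id": 2, "name": "sensor-b", "_change_type": "delete", "_commit_version": 3},
]

result = apply_cdf_changes({}, changes)
print(result)  # {1: {'id': 1, 'name': 'sensor-a2'}}
```

In a real pipeline the same behaviour would be declared through the options of `apply_changes` rather than written by hand, so check the DLT documentation for the exact parameters.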

  • @satyakiguha415 3 months ago

    Is there a link to download the slides?

  • @TheDataArchitect 1 year ago

    37:10 No Azure storage accounts?

  • @saravananharisamy8085 11 months ago

    Please share the repo for CI/CD at least.

  • @spitfirexvii 1 year ago +1

    John Carmack, is that you?

  • @oleksiy8105 11 months ago

    Streaming is always costly... If you trigger it manually or on a schedule, it is not streaming...

  • @jhonsen9842 7 months ago

    This is how you make the data engineer's job easy and pay them less.

  • @sh1310 5 months ago +1

    Makes the easy impossible.

  • @VerySeriousMan 11 months ago +1

    Hard to follow unless you know a lot already.

  • @msftora3 9 months ago +2

    Just another stereotypical reinvention of the wheel.