DP-203: 35 - Writing data to ADLSg2 from Azure Databricks

  • Published: 9 Nov 2024

Comments • 12

  • @prabhuraghupathi9131 • 6 months ago +1

    Very useful video for learning about managed tables and external tables: how to create them and how to store data back to Azure Data Lake using an external table. Thanks Piotr for this great content!!

  • @abhijitbaner • 2 months ago

    Do Databricks MANAGED tables have any advantages over EXTERNAL tables, like better performance on read/write etc.?

    • @TybulOnAzure • 2 months ago +1

      I'm not aware of any performance differences between the two (or at least I haven't noticed any).
      An important difference is that if you drop a managed table, it will also drop your data. In the case of an external table, only the metadata in the metastore is removed; the actual data stays untouched, as it is stored externally.
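      The drop-behavior difference can be sketched like this in a Databricks notebook (a minimal sketch; the schema, table, and storage-account names are placeholders, and a cluster-attached `spark` session is assumed):

      ```python
      # Managed table: Databricks controls the storage location.
      spark.sql("CREATE TABLE demo.managed_sales (id INT, amount DOUBLE)")

      # External table: data lives at an ADLSg2 path you own.
      spark.sql("""
          CREATE TABLE demo.external_sales (id INT, amount DOUBLE)
          USING DELTA
          LOCATION 'abfss://data@mystorageacct.dfs.core.windows.net/sales'
      """)

      # Dropping a managed table removes the metadata AND the underlying files.
      spark.sql("DROP TABLE demo.managed_sales")

      # Dropping an external table removes only the metastore entry;
      # the Delta files under the LOCATION path stay untouched.
      spark.sql("DROP TABLE demo.external_sales")
      ```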

  • @siddharthchoudhary103 • 6 months ago +2

    What is the difference between save and saveAsTable?

    • @TybulOnAzure • 6 months ago +2

      Both will save the data but the latter will also register it as a table in the catalog.
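      For example (a sketch assuming a DataFrame `df` in a Databricks notebook; the path and table names are made up):

      ```python
      # save: writes Delta files to the path, but nothing is registered
      # in the catalog - you can only read it back by path.
      df.write.format("delta").mode("overwrite") \
          .save("abfss://data@mystorageacct.dfs.core.windows.net/out/orders")

      # saveAsTable: writes the data AND registers it as a table, so it
      # can then be queried by name, e.g. SELECT * FROM demo.orders.
      df.write.format("delta").mode("overwrite").saveAsTable("demo.orders")
      ```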

  • @rabink.5115 • 3 months ago

    Is there any possibility to get access to the Databricks notebooks you use?

    • @TybulOnAzure • 3 months ago

      Yes, check my GitHub - link is in the video description.

  • @MokhtarBoussaada • 2 months ago

    When I tried to create an external table, I was forced to create a storage credential and an external location.

    • @TybulOnAzure • 2 months ago

      What was the code you wrote?

    • @MokhtarBoussaada • 2 months ago

      I configured access at the cluster level.
      When I directly saved files to the data lake using spark.write.format("delta").save("path"), everything was fine.
      But when I tried to create the external table (using the same CREATE TABLE query as you), I got an "external location" error. Once I created a storage credential and an external location and re-executed the cell, it worked.

    • @hernanmartindemczuk • 6 days ago

      Same here. I had to:
      1- Create an Access Connector for Azure Databricks in the Azure portal.
      2- Using the Access Connector's Managed Identity, create a new Databricks Storage Credentials in the Catalog.
      3- Create an External Location pointing to the ADLS path in the storage account, using the Storage Credentials.
      Then I was able to run the command.
      BTW, this managed identity needs the Storage Blob Data Reader or Storage Blob Data Contributor role on the storage account to work.
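      Once the storage credential from step 2 exists (here assumed to be named `adls_cred`; it is often created in the Catalog Explorer UI as described above), steps 3 onward translate into something like the following. All names and the abfss:// path are placeholders:

      ```python
      # 3) External location covering the ADLS path, secured by the
      #    storage credential created from the Access Connector.
      spark.sql("""
          CREATE EXTERNAL LOCATION IF NOT EXISTS sales_loc
          URL 'abfss://data@mystorageacct.dfs.core.windows.net/sales'
          WITH (STORAGE CREDENTIAL adls_cred)
      """)

      # With the external location in place, creating an external table
      # under that path no longer fails with the "external location" error.
      spark.sql("""
          CREATE TABLE demo.external_sales
          USING DELTA
          LOCATION 'abfss://data@mystorageacct.dfs.core.windows.net/sales'
      """)
      ```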