30. Access Data Lake Storage Gen2 or Blob Storage with an Azure service principal in Azure Databricks
- Published: 14 Oct 2024
- In this video, I discuss how to access ADLS Gen2 or Blob Storage with an Azure service principal using OAuth.
Code Used:
spark.conf.set("fs.azure.account.auth.type.storage-account.dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type.storage-account.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id.storage-account.dfs.core.windows.net", "application-id")
spark.conf.set("fs.azure.account.oauth2.client.secret.storage-account.dfs.core.windows.net", service_credential)
spark.conf.set("fs.azure.account.oauth2.client.endpoint.storage-account.dfs.core.windows.net", "login.microsof...")
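The five settings above can also be generated with a small helper, which makes the repeated `.dfs.core.windows.net` suffix less error-prone. A minimal sketch, assuming the standard Azure AD client-credentials token endpoint; `mystorage`, `client_id`, and `tenant_id` below are placeholder values, and in practice `client_secret` would come from a secret scope via `dbutils.secrets.get()`:

```python
def adls_oauth_confs(storage_account, client_id, client_secret, tenant_id):
    """Build the five Spark confs for OAuth access to an ADLS Gen2 account."""
    suffix = f"{storage_account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{suffix}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{suffix}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{suffix}": client_id,
        f"fs.azure.account.oauth2.client.secret.{suffix}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{suffix}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }

confs = adls_oauth_confs("mystorage", "app-id", "secret-value", "tenant-id")

# In a Databricks notebook you would then apply them to the session:
# for key, value in confs.items():
#     spark.conf.set(key, value)
```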
Link for Python Playlist:
• Python Playlist
Link for Azure Synapse Analytics Playlist:
• 1. Introduction to Azu...
Link for Azure Databricks Playlist:
• 1. Introduction to Az...
Link for Azure Functions Playlist:
• 1. Introduction to Azu...
Link for Azure Basics Playlist:
• 1. What is Azure and C...
Link for Azure Data Factory Playlist:
• 1. Introduction to Azu...
Link for Azure Data Factory Real-time Scenarios Playlist:
• 1. Handle Error Rows i...
Link for Azure Logic Apps Playlist:
• 1. Introduction to Azu...
#Azure #Databricks #AzureDatabricks
Really explained well
Please make an interview questions and answers series.
The content explanation is very nice, but one suggestion: use proper naming conventions. That will make it easier for new users to follow.
Recent interview questions:
1. If you are using Unity Catalog in your project, can we use service principals to connect ADF to Databricks?
Sir, can you please explain this in depth?
2. Can we use YARN as the cluster manager or resource manager for Spark in Databricks?
In real time?
Very crisp and clean, thank you for this video.
Thank you so much for your video. It was a much needed help.
love it, very well explained.
Thank you 😊
Thanks for the video; it is very informative. Using this method, do you need to execute the spark.conf.set() commands every time you restart the cluster? My guess is that you would, since you are only affecting configs of this specific Spark session.
Yes, in real time these configurations will be part of your application code. Once your cluster restarts, it kills your application due to driver unavailability, and you need to start from the beginning.
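The exchange above reflects that `spark.conf.set()` is session-scoped, so these settings vanish after a cluster restart. A common pattern is to keep them in one shared setup function and call it at the start of every job or notebook. A minimal sketch; `FakeConf` is purely a stand-in for the set/get surface of `spark.conf` so the pattern can be shown outside Databricks, and in a real notebook you would pass `spark.conf` itself:

```python
def apply_session_confs(conf, settings):
    """Re-apply a dict of Spark conf key/value pairs to the current session."""
    for key, value in settings.items():
        conf.set(key, value)

# Stand-in for spark.conf, for illustration only.
class FakeConf:
    def __init__(self):
        self._store = {}
    def set(self, key, value):
        self._store[key] = value
    def get(self, key):
        return self._store[key]

conf = FakeConf()
apply_session_confs(conf, {
    "fs.azure.account.auth.type.mystorage.dfs.core.windows.net": "OAuth",
})
print(conf.get("fs.azure.account.auth.type.mystorage.dfs.core.windows.net"))  # OAuth
```

In Databricks this setup function often lives in a separate notebook that every job includes with `%run`, so the configuration is defined once but executed per session.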
I want to know the benefit of using this service principal ID, name, and value with the OAuth configuration, when we can access files from Blob Storage directly using just a secret scope. Is there any advantage to this?
Hi sir, I have been following every video in this Databricks playlist.
Could you tell me how many more videos there will be to complete this playlist?
Thanks. This helped a ton
Sir, how can we connect the Azure Databricks Hive metastore to an external ETL tool like Informatica? The purpose is to fetch data from Hive tables and use the Databricks engine for pushdown optimization to improve the performance of the data fetching.
Hi, one question: don't you have to mount the file system again using these Azure service principal configurations?
I think you are able to read the data because your storage is already mounted via direct access keys?
I think he is not using a DBFS mount here; once you are authorized using a service principal, you can read directly from the storage account. But yes, you can mount your ADLS to the Databricks file system once, it is set at the workspace level, and then you can read from DBFS directly instead of ADLS.
@@ravikumarkumashi7065 Mounting is a deprecated pattern for storing and accessing data; it's not recommended anymore. Using the ABFS driver is the best way right now! docs.databricks.com/external-data/azure-storage.html
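As that reply notes, once the service principal is authorized you can skip mounts and read with a direct `abfss://` path via the ABFS driver. A small sketch of the URI format; the container, account, and path names here are hypothetical:

```python
def abfss_uri(container, storage_account, path):
    # Direct-access URI for the ABFS driver:
    # abfss://<container>@<storage-account>.dfs.core.windows.net/<path>
    return (f"abfss://{container}@{storage_account}"
            f".dfs.core.windows.net/{path.lstrip('/')}")

uri = abfss_uri("raw", "mystorage", "/sales/2024")
print(uri)  # abfss://raw@mystorage.dfs.core.windows.net/sales/2024

# In a notebook, no mount is needed once the OAuth confs are set:
# df = spark.read.parquet(uri)
```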
I followed your instructions, but it is still throwing the error "Unsupported Azure scheme: abfss". May I know why, and what the steps are to fix it?
Hi, can you cover how to set up a shared external Hive metastore to be used across multiple Databricks workspaces? The purpose is to be able to reference dev workspace data in a prod instance.
So we do not need to mount after setting the Spark config?
The videos on connecting to Data Lake Storage are confusing: why should we use a service principal when we can access it through Azure Key Vault directly?
Please make a video on the PolyBase and JDBC approaches.
What if the storage account has HNS disabled and I still want to use an SPN?
Thanks a lot, sir.
Welcome ☺️
wafa ur the best
If 10 APIs are accessed through ADF in Azure and 6 succeed while 4 fail, how do we get only the failed APIs?
♥♥