Databricks: connect to Azure SQL DB
- Published: Sep 7, 2024
- This video shows how to connect to Azure SQL DB from Databricks and transfer data.
Here is the code:
df = spark.read.csv('dbfs:/FileStore/tables/Legally_Operating_Businesses.csv', header=True)
display(df)
%scala
val schema = "DCALicense string,LicenseType string,LicenseExpirationDate string,LicenseStatus string,LicenseCreationDate string,Industry string,BusinessName string,BusinessName2 string,AddressBuilding string,AddressStreetName string,SecondaryAddressStreetName string,AddressCity string,AddressState string,AddressZIP string,ContactPhoneNumber string,AddressBorough string,BoroughCode string,CommunityBoard string,CouncilDistrict string,BIN string,BBL string,NTA string,CensusTract string,Detail string,Longitude string,Latitude string,Location string"
%scala
import java.util.Properties
val jdbcHostName = "dbricksconnectivity.database.windows.net"
val jdbcPort = 1433
val jdbcdbName = "dbricksCon"
val myproperties = new Properties()
val jdbcUsername = dbutils.secrets.get(scope = "bricksScope", key = "sqlusername")
val jdbcPassword = dbutils.secrets.get(scope = "bricksScope", key = "sqlpassword")
myproperties.put("user",jdbcUsername)
myproperties.put("password",jdbcPassword)
val url = s"jdbc:sqlserver://${jdbcHostName}:${jdbcPort};database=${jdbcdbName}"
val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
myproperties.setProperty("Driver", driverClass)
val mydf = spark.read.format("csv").option("header","true").schema(schema).load("dbfs:/FileStore/tables/Legally_Operating_Businesses.csv")
mydf.write.jdbc(url,"license",myproperties)
%scala
import org.apache.spark.sql.SaveMode
mydf.write.mode(SaveMode.Append).jdbc(url,"license",myproperties)
Very nice demonstration of using JDBC to connect to Azure SQL DB from Azure Databricks.
We can read from and write to Azure SQL tables and other tables in Databricks via JDBC. But how can we do an update or delete on the same tables?
Is there a reason why you shouldn't do this? I am surprised this isn't encouraged as a best practice, which makes me think I'm missing something.
If the data volume is high, it may affect your program's performance. ADF (Azure Data Factory) would be a better choice for such scenarios.
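One option for row-level updates (a sketch, not from the video): Spark's JDBC writer can only append or overwrite whole DataFrames, but you can open a plain `java.sql` connection from the driver using the same `url`, `jdbcUsername`, and `jdbcPassword` built earlier. The table and column names below are the ones from the video's schema; the status values are illustrative.

```scala
import java.sql.DriverManager

// Illustrative helper: builds an UPDATE statement for the license table.
def updateStatusSql(table: String, from: String, to: String): String =
  s"UPDATE $table SET LicenseStatus = '$to' WHERE LicenseStatus = '$from'"

// Opens a plain JDBC connection and runs the statement; returns rows affected.
// url/user/password are assumed to come from the secret-scope code above.
def runUpdate(url: String, user: String, password: String, sql: String): Int = {
  val conn = DriverManager.getConnection(url, user, password)
  try conn.createStatement().executeUpdate(sql)
  finally conn.close()
}
```

The same `executeUpdate` call works for DELETE statements as well.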
Thanks for sharing. I want to delete records from Azure SQL that are older than 13 months. How can we implement that?
You can use queries instead of the table name. I can provide a video later.
best
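For the 13-month cutoff specifically, one hedged sketch (not the author's code) is to issue the DELETE over a plain JDBC connection and let SQL Server compute the cutoff with `DATEADD`. Since the video's schema stores dates as strings, `TRY_CONVERT` is used to compare them as dates.

```scala
import java.sql.DriverManager

// DELETE rows whose date column is older than `months` months (SQL Server syntax).
// Table and column names match the video's schema; adjust to yours.
def deleteOlderThanSql(table: String, dateColumn: String, months: Int): String =
  s"DELETE FROM $table WHERE TRY_CONVERT(date, $dateColumn) < DATEADD(month, -$months, GETDATE())"

// url/user/password are assumed to come from the secret-scope code above.
def deleteOldRows(url: String, user: String, password: String): Int = {
  val conn = DriverManager.getConnection(url, user, password)
  try conn.createStatement().executeUpdate(
    deleteOlderThanSql("license", "LicenseCreationDate", 13))
  finally conn.close()
}
```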
Are you creating any videos reading data from REST api?
Yes it is there in my channel
Hi sir, thanks for the video.
Now I have a scenario.
We have multiple sources, and we save all these sources' login details in an Excel file (called a control table), along with the Azure Key Vault secret names for the password and security token.
Now, using a single PySpark script, we have to connect dynamically using the Azure Key Vault details and load the data into ADLS Gen2.
How can we do this?
Need a reply ASAP.
Thanks in Advance
Ram
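One possible shape for the loop that question describes (a sketch under assumed control-table column names; `spark` and `dbutils` exist only inside a Databricks notebook, so they are passed in rather than referenced globally):

```python
def missing_columns(row):
    """Check a control-table row for the columns this sketch assumes."""
    required = ("jdbc_url", "user", "secret_scope", "pwd_secret_key",
                "source_table", "target_path")
    return [c for c in required if c not in row]

def load_all_sources(spark, dbutils, control_rows):
    """control_rows: list of dicts read from the Excel control table."""
    for row in control_rows:
        if missing_columns(row):
            raise ValueError(f"control row missing columns: {missing_columns(row)}")
        # Fetch the password from Key Vault via the Databricks secret scope.
        pwd = dbutils.secrets.get(scope=row["secret_scope"], key=row["pwd_secret_key"])
        df = (spark.read.format("jdbc")
              .option("url", row["jdbc_url"])
              .option("dbtable", row["source_table"])
              .option("user", row["user"])
              .option("password", pwd)
              .load())
        # Write each source out to its ADLS Gen2 path.
        df.write.mode("overwrite").parquet(row["target_path"])
```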
Hi, while I am trying to connect Databricks to SSMS I am getting an error like "The TCP/IP connection to the host failed" for the port. How can I resolve this error?
Looks like it is a network issue. Please make sure you have the required access and your IP is whitelisted.
Can we delete a table in Azure SQL Database from Databricks using Scala?
Can you please share that code in the description?
updated
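A minimal sketch of dropping an Azure SQL table from Databricks in Scala (assuming the same `url` and secret-scope credentials as in the video's code; `DROP TABLE IF EXISTS` requires SQL Server 2016+ / Azure SQL):

```scala
import java.sql.DriverManager

// Builds the DROP statement separately so it can be inspected or logged.
def dropTableSql(table: String): String = s"DROP TABLE IF EXISTS $table"

// Opens a plain JDBC connection and drops the table.
def dropTable(url: String, user: String, password: String, table: String): Unit = {
  val conn = DriverManager.getConnection(url, user, password)
  try conn.createStatement().execute(dropTableSql(table))
  finally conn.close()
}
```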
Is there a way to connect to Databricks from Oracle SQL Developer?
I didn't try that, but it should be possible.
How do you write this in PySpark?
Will provide that soon
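A rough PySpark equivalent of the Scala flow above (a sketch, not the author's code; `spark` and `dbutils` exist only inside a Databricks notebook, so they are passed in here; the host, scope, and table names are the ones from the video):

```python
def jdbc_url(host: str, port: int, database: str) -> str:
    """Build the same SQL Server JDBC URL the Scala code constructs."""
    return f"jdbc:sqlserver://{host}:{port};database={database}"

def write_license(spark, dbutils):
    url = jdbc_url("dbricksconnectivity.database.windows.net", 1433, "dbricksCon")
    props = {
        "user": dbutils.secrets.get(scope="bricksScope", key="sqlusername"),
        "password": dbutils.secrets.get(scope="bricksScope", key="sqlpassword"),
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }
    df = (spark.read.format("csv")
          .option("header", "true")
          .load("dbfs:/FileStore/tables/Legally_Operating_Businesses.csv"))
    # Append the rows to the existing "license" table, as in the Scala version.
    df.write.mode("append").jdbc(url, "license", properties=props)
```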
Hi bro,
How can I connect Azure Data Studio from Databricks, Databricks to a data lake, and then the data lake to Snowflake? Can you help me?
Can I know the reason to connect to Azure Data Studio from Databricks? I didn't try this method, as I don't have a use case.