Stephanie Rivera
  • 98 videos
  • 55,169 views

Videos

Decision Makers Get Answers Asking English Questions 2024.05.01
113 views · 1 month ago
See how decision-makers can quickly get answers by asking questions in English. ► Speaker - Hobbs www.linkedin.com/in/iamhobbs #databricks #genai #geniespaces
Have a conversation with your data for Excel Users 2024.04.30
125 views · 1 month ago
Quick Introduction to Databricks Genie Spaces for Excel Users
How to enable firewall support for your Azure workspace storage account 2024.05.30
264 views · 1 month ago
By default, the Azure storage account for your workspace accepts authenticated connections from all networks. This video will walk you through enabling workspace storage firewall support on an existing workspace to block public network access. This feature is generally available (GA) to all Azure Databricks customers. ► Speaker - Scott Grzybowski www.linkedin.com/in/scottgrzybowski/ #databricks...
Mastering the SparkUI on Databricks 2024.04.30
326 views · 2 months ago
Have you had trouble diagnosing cost and performance issues on Spark? Fear not: a document by Peter Stern, a Specialist Solutions Architect at Databricks, covers this in detail. Watch this video to learn different ways to debug issues on Spark on Databricks. ►[Documentation] Learn more about diagnosing cost and performance issues using the Spark UI here - docs.databricks.com/en/optimization...
How to change your Databricks workspace networking configuration on AWS 2024.05.09
233 views · 2 months ago
Are you a platform administrator wanting to understand more about network configuration on Databricks on AWS? If so, this video by JD Braun, a Senior Specialist Solutions Architect at Databricks, is the right one for you! In just under 25 min, you’ll learn how to change the network configuration, whether it's to remove a subnet or change the CIDR range. The demo uses Terraform and the ac...
10 min Step by Step guide to Evaluating LLMs with MLflow! - 2024.04.29
661 views · 2 months ago
In this video, Colton Peltier, a Staff Data Scientist at Databricks, covers MLflow’s evaluation capabilities for GenAI in just 10 min! The video walks through evaluating 3 different LLMs on a task and helps users determine which LLM performs best. Both pre-built metrics that come with MLflow and custom metrics that can be built are used in this demo and...
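
A minimal sketch of the evaluation pattern the video covers, using mlflow.evaluate with a built-in task type; the data and the stand-in model function below are illustrative, not the video's code:

```python
import mlflow
import pandas as pd

eval_df = pd.DataFrame({
    "inputs": ["What is MLflow?", "What is Delta Lake?"],
    "ground_truth": [
        "MLflow is an open source platform for the ML lifecycle.",
        "Delta Lake is an open source storage layer with ACID transactions.",
    ],
})

def answer_with_llm(df: pd.DataFrame) -> pd.Series:
    # Stand-in: a real version would call the candidate LLM endpoint.
    return pd.Series(["MLflow manages the ML lifecycle."] * len(df))

with mlflow.start_run():
    results = mlflow.evaluate(
        model=answer_with_llm,
        data=eval_df,
        targets="ground_truth",
        model_type="question-answering",  # attaches pre-built QA metrics
    )
    print(results.metrics)  # compare these across the 3 candidate LLMs
```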
Create a DBRX-based Gen AI Agent in 20 minutes! 2024.04.04
1.5K views · 3 months ago
Have you always wanted to create a Gen AI agent yourself but didn’t know how? You’re in the right place. In this video, Arthur Dooner and Robert Mosley, Specialist Solutions Architects at Databricks, help you create a DBRX-based Gen AI agent in just 20 min! DBRX is a new state-of-the-art open LLM developed by Databricks and one of the fastest models currently available. Read ...
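
For context, a minimal sketch of querying DBRX through a Databricks Foundation Model serving endpoint with the OpenAI-compatible client; the host and token are placeholders, and the "databricks-dbrx-instruct" pay-per-token endpoint name assumes that feature is enabled in your workspace:

```python
from openai import OpenAI

client = OpenAI(
    api_key="<databricks-personal-access-token>",
    base_url="https://<workspace-host>/serving-endpoints",
)
resp = client.chat.completions.create(
    model="databricks-dbrx-instruct",
    messages=[{"role": "user", "content": "What can a Gen AI agent do?"}],
    max_tokens=256,
)
print(resp.choices[0].message.content)
```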
AutoML on Databricks - 2024.03.15
215 views · 3 months ago
In this video, Jyotsna Bharadwaj, a Solutions Architect at Databricks, will go over how AutoML works. Automated ML (AutoML) is a fully automated model development solution seeking to democratize machine learning. AutoML helps to automate the ML process from data to model selection. This video starts from the basics with Jyotsna beginning to differentiate between AutoML and traditional ML soluti...
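
A minimal sketch of kicking off an AutoML classification run from a notebook; the DataFrame and label column are hypothetical:

```python
from databricks import automl

# `df` is assumed to be a Spark DataFrame already loaded in the workspace,
# with a hypothetical "churn" label column.
summary = automl.classify(
    dataset=df,
    target_col="churn",
    timeout_minutes=30,
)
# AutoML trains and ranks candidate models and logs them to MLflow.
print(summary.best_trial.model_path)
```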
Use Agent Studio to build a GenAI Agent in minutes!! 2024.03.11
926 views · 4 months ago
In this video, Arthur Dooner and Robert Mosley, Specialist Solutions Architects at Databricks, cover how to quickly create a chatbot on Databricks that interacts with your endpoints. The chatbot in the video can use RAG to retrieve API documentation and can execute API calls against the current Databricks endpoints. Learn how to use Agent Studio ...
Introduction to Databricks Data Intelligence Platform in 2024! - 2024.03.05
931 views · 4 months ago
In this video, John Ward, a Senior Solutions Architect at Databricks, goes over the Data Intelligence Platform. Databricks is a leading Data and AI company that provides a unified, standardized platform for data engineering, data science, and machine learning. It is known for its Lakehouse Platform, a combination of data lake and data warehouse, which unifies data, analytics, and AI. This platform allows data te...
Azure Databricks Networking Security (Part 1) - 2024.02.02
958 views · 5 months ago
In this video, Arthur Dooner, a Senior Specialist Solutions Architect at Databricks, covers networking security on Azure Databricks. Arthur starts from the basics, discussing the control and data planes, serverless, and deployment using public and private subnets, and includes a step-by-step demo. You’ll be able to build your foundational knowledge for securing Azure Databric...
State Schema Evolution in PySpark using applyInPandasWithState - 2024.01.25
399 views · 5 months ago
In this video, Craig Lukasik, a Senior Specialist Solutions Architect at Databricks, covers state schema evolution in streaming. Delta Lake handles schema evolution, but what if the state used in stateful Structured Streaming needs to evolve? This video helps you understand the nuances of schemas in stateful Structured Streaming and provides a strategy for evolving state schema. T...
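
A minimal sketch of one evolution strategy consistent with the topic: keep the Spark-visible state schema fixed as a version number plus a JSON payload, so new fields can be added inside the payload without changing the state schema. The names, and whether this matches the video's exact approach, are assumptions:

```python
import json
from typing import Iterator, Tuple

import pandas as pd
from pyspark.sql.streaming.state import GroupState

def count_events(
    key: Tuple[str], pdfs: Iterator[pd.DataFrame], state: GroupState
) -> Iterator[pd.DataFrame]:
    # State is (version, json_payload); evolve fields inside the payload.
    payload = json.loads(state.get[1]) if state.exists else {"count": 0}
    payload["count"] += sum(len(pdf) for pdf in pdfs)
    state.update((1, json.dumps(payload)))  # payload version 1
    yield pd.DataFrame({"key": [key[0]], "count": [payload["count"]]})

# counts = events.groupBy("key").applyInPandasWithState(
#     count_events,
#     outputStructType="key string, count long",
#     stateStructType="version int, payload string",
#     outputMode="update",
#     timeoutConf="NoTimeout",
# )
```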
Deploying Scalable Databricks Infrastructure with Terraform - 2024.01.24
532 views · 5 months ago
Are you looking at scaling Databricks infrastructure with Terraform but aren’t sure where to start? Then this video is for you! In just under 20 min, JD Braun and Tony Bo, Specialist Solutions Architects at Databricks, cover common issues and solutions, and methods to scale the infrastructure for the long run. Target Audience - DevOps Engineers, Platform Administrators, Data/Solutions Architects...
Excel to Databricks - Getting to robust data insights in 15 minutes 2024.01.04
358 views · 6 months ago
Do you have a lot of data in Excel and want meaningful, intelligent insights from it? Then Databricks can help you! In this video, Olivia Zhang, a Solutions Architect at Databricks, goes over how users can gain robust data insights using Databricks in just under 15 min! This helps users achieve reliable, secure, and low-latency results that are reusable. All of...
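
A minimal sketch of the Excel-to-Delta hop inside a Databricks notebook (where spark is predefined); the file path and table name are placeholders:

```python
import pandas as pd

# Read the workbook with pandas (the openpyxl engine handles .xlsx), then
# promote it to a governed Delta table for SQL and dashboards.
pdf = pd.read_excel("/dbfs/FileStore/uploads/sales.xlsx", engine="openpyxl")
df = spark.createDataFrame(pdf)
df.write.format("delta").mode("overwrite").saveAsTable("main.default.sales")
```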
Managed Tables vs External Tables in Unity Catalog - 2023.11.03
966 views · 8 months ago
Lakehouse Federation - Querying data in other warehouses 2023.11.02
280 views · 8 months ago
Systems Tables - 10.27.2023
316 views · 8 months ago
Databricks' Project 1b: English as a programming language - 2023.10.17
263 views · 9 months ago
Quick Intro to Lakeview Dashboards (under 10 minutes!) - 2023.10.16
728 views · 9 months ago
Databricks Asset Bundles - 2023.10.16
1.9K views · 9 months ago
Dive into Streaming Checkpoints & Best Practices - 1.25.2023
871 views · 9 months ago
Databricks Generative AI - Large Language Models - HD 1080p
798 views · 10 months ago
Migration from Hive Metastore to Unity Catalog - 2023.08.30
3.7K views · 10 months ago
DLT UDFs UC Oh MY! - 08.16.2023 - HD 1080p
383 views · 10 months ago
Finding your way with MapInPandas() - 08.09.2023 - HD 1080p
397 views · 11 months ago
LLMs, Dolly, and ChatGPT: a brief history of Natural Language Processing models - 2023.08.02 - HD
392 views · 11 months ago
Analyzing CloudTrail Logs with Databricks on AWS - 07.26.2023 - HD 1080p
276 views · 11 months ago
Databricks + Power BI: Design Best Practices - 2023.07.12
2.9K views · 1 year ago
History of Hadoop and Big Data - 06.21.2023 - HD 1080p
303 views · 1 year ago

Comments

  • @wodfest
    @wodfest 1 day ago

    Is there a part 2 yet?

  • @vedakalluri
    @vedakalluri 1 day ago

    Can't wait for Part 2 now that serverless is GA

  • @ShrikanthP-je4ki
    @ShrikanthP-je4ki 1 month ago

    This video is sufficient to make a perfect design considering Power BI and Databricks! Thanks a lot! I appreciate your detail and crystal-clear explanations.

  • @makportal
    @makportal 1 month ago

    This is a great video, thanks for sharing! Did part 2 ever get recorded?

  • @benjaminnewman3833
    @benjaminnewman3833 2 months ago

    Is there a part 2? This is really helpful.

  • @TusharHatwar-pp7uf
    @TusharHatwar-pp7uf 2 months ago

    Can we get the slides used in this video?

  • @tusharhatwar
    @tusharhatwar 2 months ago

    Where can I get the slides used in this video?

  • @damolaakinleye101
    @damolaakinleye101 2 months ago

    Saw this on Reddit the other day. Thanks for sharing the video. It would be lovely to see a Spark query being tuned live with all of these features. Performance tuning can be a bit of a black art in Spark.

  • @marz_nana
    @marz_nana 2 months ago

    Hi Stephanie, thanks for the video. I am currently using DLT with APPLY CHANGES and writing the output to the Hive metastore, which has AWS Glue connected to it. The output is a streaming table; however, it is actually a view built from the __apply_changes_storage_xxx table. Any idea how this could be migrated from Hive to UC? Also, when I change the same DLT pipeline target to a UC schema, it seems AWS Glue is not able to get the table metadata. Is there any documentation I can follow for migrating DLT-built tables from Hive to UC? Thanks

  • @josebellido4676
    @josebellido4676 2 months ago

    Nice demo! In the video it is not clear how to access the UI in Databricks to create the agent. Can I get some help on this, please?

  • @akashghadage5377
    @akashghadage5377 2 months ago

    Thanks for the wonderful session. One more on checkpoints with respect to cloudFiles (i.e., Auto Loader) is much needed.

  • @akashghadage5377
    @akashghadage5377 3 months ago

    Thanks for the Databricks skill builder series.

  • @SuhasYogish
    @SuhasYogish 3 months ago

    Love the video! Definitely going to try this out. Could you help me understand how you start the UI builder shown at 26:20?

  • @allthingsdata
    @allthingsdata 3 months ago

    Simply excellent. More of this!

  • @anirbandatta1498
    @anirbandatta1498 3 months ago

    Hi, this is a nicely created demo. Where can I get the notebooks, please?

  • @bajrangijha
    @bajrangijha 3 months ago

    Hey Stephanie, I migrated everything from hive_metastore to Unity just now, but when I'm executing my pipelines they throw class and library errors. I installed the same libraries that were in the old clusters. In fact, I edited the old cluster and changed the mode to "shared" in order to make it Unity-enabled. The same libraries work fine in the old cluster. Do you happen to know what I'm missing here?

    • @stephanieamrivera
      @stephanieamrivera 3 months ago

      Can you reach out to your account team? I don't know what's going on.

    • @bajrangijha
      @bajrangijha 3 months ago

      @@stephanieamrivera Thanks for replying. The issue was actually resolved. According to the documentation, shared mode does not support some of the APIs and the Spark context. So we used a single-user, multi-node cluster and it's all working fine. Thanks.

  • @aysegulcayiraydar1875
    @aysegulcayiraydar1875 4 months ago

    If we scan the data via an SPN, should we define the SPN as an admin in DBX to get an access token for it?

    • @stephanieamrivera
      @stephanieamrivera 4 months ago

      To configure a Service Principal Name (SPN) for accessing resources in Azure Databricks, you typically define the SPN as an admin in Databricks, but that isn't required to obtain an access token. Here's the typical flow to use an SPN to access data in Databricks:
      1. Create an SPN: First, create an SPN in Azure Active Directory (AAD) and grant it the necessary permissions for the resources you require.
      2. Assign permissions: Assign the appropriate roles or access policies to the SPN for accessing the Databricks workspace and other resources.
      3. Configure Databricks: As an admin in Databricks, configure the SPN by creating a secret scope and storing the SPN credentials securely. This can be done using the Databricks CLI or the Databricks UI.
      4. Access tokens: To obtain an access token for the SPN, use an Azure Active Directory authentication flow, such as OAuth2 client credentials, to authenticate and generate the token. This token is used to authenticate the SPN when accessing Databricks resources.
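
      A minimal sketch of step 4 using the azure-identity client-credentials flow; the tenant/client IDs and secret are placeholders, and the GUID scope is the well-known Azure Databricks resource ID:

      ```python
      from azure.identity import ClientSecretCredential

      # Placeholders for the SPN created in steps 1-2.
      credential = ClientSecretCredential(
          tenant_id="<tenant-id>",
          client_id="<spn-client-id>",
          client_secret="<spn-client-secret>",
      )
      # 2ff814a6-... is the AzureDatabricks first-party application ID.
      token = credential.get_token("2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default")
      # Pass token.token as the Bearer token on Databricks REST API calls.
      print(token.token[:16], "...")
      ```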

  • @AlexanderBishop-cadent
    @AlexanderBishop-cadent 4 months ago

    Do you have a link to the ETL pipeline step-by-step process?

    • @stephanieamrivera
      @stephanieamrivera 4 months ago

      Is this what you were looking for? databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/8599738367597028/2070341989008551/3601578643761083/latest.html.

  • @iamaashishpatel
    @iamaashishpatel 4 months ago

    What does the Customer VPC NACL resemble?

    • @stephanieamrivera
      @stephanieamrivera 4 months ago

      The Customer VPC NACL is a security feature in AWS that functions as a virtual firewall for controlling inbound and outbound traffic at the subnet level. It resembles a set of rules that determine what traffic is allowed or denied in a VPC. Here are some key aspects and characteristics of the Customer VPC NACL:
      1. Associations: A VPC NACL is associated with one or more subnets within a VPC. By default, each subnet in a VPC is associated with the default VPC NACL, but you can associate a custom NACL with your subnets.
      2. Numbering: Each VPC NACL rule is assigned a rule number that determines the order in which rules are evaluated. Rule numbers can be either explicit (specified by the user) or implicit (automatically assigned by AWS).
      3. Inbound and outbound rules: VPC NACLs have separate sets of rules for inbound and outbound traffic. Inbound rules control incoming traffic to the subnet, while outbound rules control outgoing traffic from the subnet.
      4. Allow and deny rules: VPC NACLs can have rules that either allow or deny traffic. The rules are evaluated in order, and the first matching rule determines whether the traffic is allowed or denied.
      5. Stateless: VPC NACLs are stateless, which means that responses to allowed inbound traffic are not automatically allowed outbound. Separate rules must be created for inbound and outbound traffic.
      6. Default rules: By default, a VPC NACL allows all inbound and outbound traffic. You can modify the default rules to tighten security or create custom rules to fit your specific requirements.
      7. Logging: VPC NACLs can be configured to log accepted and denied traffic, which helps in monitoring and analyzing network traffic patterns.
      Note that the VPC NACL operates at the subnet level and provides a basic level of security. For more granular control and advanced security features, it is recommended to use security groups in conjunction with VPC NACLs. Reference: AWS documentation on VPC NACLs - docs.aws.amazon.com/vpc/latest/userguide/vpc-network-acls.html. Hope this helps!
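
      A minimal sketch of adding one such rule with boto3; the ACL ID and CIDR are placeholders. Because NACLs are stateless (point 5), a matching egress rule for the ephemeral port range would also be needed:

      ```python
      import boto3

      ec2 = boto3.client("ec2")

      # Allow inbound HTTPS into the subnet; the rule number sets evaluation order.
      ec2.create_network_acl_entry(
          NetworkAclId="acl-0123456789abcdef0",
          RuleNumber=100,
          Protocol="6",              # TCP
          RuleAction="allow",
          Egress=False,              # inbound rule
          CidrBlock="10.0.0.0/16",
          PortRange={"From": 443, "To": 443},
      )
      ```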

  • @jeromedupourque6067
    @jeromedupourque6067 4 months ago

    Thank you, Arthur, great video! Could you tell me if it is possible to download your architecture diagrams somewhere? Thank you

    • @stephanieamrivera
      @stephanieamrivera 4 months ago

      Unfortunately, YouTube doesn't let me upload files or images. It might be easier for you to take screenshots from the video. Sorry about that!

  • @rickrofe4382
    @rickrofe4382 5 months ago

    Thanks for this session, it has been very useful. Keep it going.

  • @krishnakoirala2088
    @krishnakoirala2088 5 months ago

    Since DLT displays counts on each box, is it usually slower than a regular workflow? With the enhanced features of Unity Catalog that have come (or are coming) in, specifically lineage and such, we can easily see which tables and views are connected where. Is it worth using DLT in the workflow if someone does not want to pay the extra cost associated with it, considering that I will do the OPTIMIZE and Z-ordering on my own at some frequency?

    • @stephanieamrivera
      @stephanieamrivera 4 months ago

      The performance of DLT compared to a regular workflow depends on various factors, including the specific use case, data volume, query patterns, and optimization techniques used. While DLT introduces some overhead due to its real-time change data capture capabilities, the benefits it provides may still make it worth considering, even without utilizing all of its enhanced features.
      1. Performance: DLT may have some additional processing overhead compared to regular workflows due to change tracking and maintaining transactional integrity. However, DLT's optimizations, such as indexing, caching, and predicate pushdown, can help mitigate this impact; if you leverage them effectively, the performance difference might be minimal.
      2. Enhanced features: Unity Catalog, with its lineage and other capabilities, can provide valuable insights into data connections and lineage. These features can enhance data understanding, data governance, and debugging processes.
      3. OPTIMIZE and Z-ordering: Delta Lake provides various optimization techniques, including optimizing layout and improving data locality by leveraging Z-ordering. If you can incorporate these optimizations effectively into your regular workflow without using DLT, you can still achieve performance benefits without incurring the additional cost associated with DLT.
      In summary, using DLT depends on the specific requirements and constraints of your use case.
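
      For the do-it-yourself maintenance mentioned in the question, a minimal sketch of what a scheduled notebook job could run; the table and column names are examples:

      ```python
      # Periodic Delta maintenance outside DLT, run from a scheduled job.
      spark.sql("OPTIMIZE main.sales.orders ZORDER BY (customer_id)")
      spark.sql("VACUUM main.sales.orders RETAIN 168 HOURS")  # 7-day retention
      ```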

  • @maximilianschmitz8737
    @maximilianschmitz8737 5 months ago

    It always says that the token isn't correct or doesn't have the right permissions. However, my PAT has admin permissions on that workspace. Have you had this issue?

    • @stephanieamrivera
      @stephanieamrivera 4 months ago

      If the personal access token (PAT) is not being recognized or does not have the right permissions, there are a few troubleshooting steps you can try:
      1. Verify token permissions: Confirm that the PAT has the necessary permissions assigned within the Databricks workspace. Although you mentioned that the PAT has admin permissions, make sure it has the permissions required for the specific actions you are trying to perform. For example, if you are accessing Delta tables, ensure that the PAT has the necessary permissions for table operations.
      2. Check workspace configuration: Verify that token-based authentication is enabled in your Databricks workspace and that there are no restrictions or configurations that could prevent the use of tokens. Contact your workspace administrator to confirm the token settings and make sure there are no conflicts or restrictions.
      3. Try a new token: If all else fails, revoke the existing PAT and generate a new one. Sometimes there can be issues with specific tokens, so generating a fresh token may resolve the problem.
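
      A quick way to test step 1 from outside the workspace is to call a simple authenticated REST endpoint with the token; the host and token are placeholders:

      ```python
      import requests

      resp = requests.get(
          "https://<workspace-host>/api/2.0/preview/scim/v2/Me",
          headers={"Authorization": "Bearer <personal-access-token>"},
      )
      print(resp.status_code)  # 200 means the token authenticated
      print(resp.text[:200])   # a 401/403 body explains what was rejected
      ```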

  • @AlanBorsato
    @AlanBorsato 5 months ago

    Thanks, Arthur

  • @damolaakinleye101
    @damolaakinleye101 5 months ago

    Thanks for this. We need more of this ☺

  • @allthingsdata
    @allthingsdata 5 months ago

    Excellent resource. I wish there were a longer session on Databricks Terraform with an e2e walkthrough, but this is a good overview.

  • @ghyootceg
    @ghyootceg 5 months ago

    Fantastic video! Makes me wonder what should and shouldn't be built using Databricks SQL. As I understand it, this video suggests a balance between the gold layer and Power BI.

  • @Pixelements
    @Pixelements 6 months ago

    Great video, we are going to migrate from a typical data warehouse to a Lakehouse. The only thing that you did not mention (or I did not understand) is how to serve the data for Power BI datasets (aka semantic models). In the Azure data warehouse world, we have a technical user that refreshes the dataset hourly or daily. But how do you refresh a dataset that is based on a Lakehouse? Do you use the Databricks connector in PBI?

    • @stephanieamrivera
      @stephanieamrivera 5 months ago

      I have asked Hobbs to reply :)

    • @BricksBI
      @BricksBI 5 months ago

      Hi @Pixelements. If you're using an Import approach, you set a refresh schedule in the Power BI Service and your model then refreshes itself as often as the schedule dictates. If you're using DirectQuery, each time any given report is opened, it re-runs the query it's based on and retrieves the results, so there's no need to set a refresh schedule there. You can also turn on a report setting in DirectQuery reports that says "once the report is open, go ahead and re-run your query every X minutes." In either case, your PBI semantic model (previously known as a PBI dataset) uses whatever connector you used when you made it to reach from the Power BI Service to Databricks and retrieve new data.

  • @pcp21599
    @pcp21599 6 months ago

    Great work, went through your playlist and its content is awesome. :)

  • @arvind1cool
    @arvind1cool 6 months ago

    Simply awesome. Crystal clear.

  • @rickrofe4382
    @rickrofe4382 6 months ago

    Nice demo, you want a job! Keep up the great work.

  • @lucaslira5
    @lucaslira5 7 months ago

    Using it as a batch and merging it in foreachBatch. Should I create the table at the Delta location before processing? I mean for tables arriving every day.

    • @stephanieamrivera
      @stephanieamrivera 5 months ago

      I asked Robert to reply :)

    • @robertmosley4577
      @robertmosley4577 5 months ago

      It depends on how much control you want. I have some customers that explicitly create every table before loading into it, but that's not necessary. You can create it ad-hoc at the time it's loaded. Chances are, many columns will be inferred as strings, so you may find that you want to specifically create the table before you begin loading into it.

    • @lucaslira5
      @lucaslira5 5 months ago

      Thank you @@robertmosley4577

    • @lucaslira5
      @lucaslira5 5 months ago

      Thank you @@stephanieamrivera
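
      A minimal sketch of the pattern discussed in this thread: pre-create the Delta table with an explicit schema, then upsert each micro-batch in foreachBatch; all names are examples:

      ```python
      from delta.tables import DeltaTable

      # An explicit schema up front avoids columns being inferred as strings.
      spark.sql("""
          CREATE TABLE IF NOT EXISTS main.default.customers (
              id BIGINT, name STRING, updated_at TIMESTAMP
          ) USING DELTA
      """)

      def upsert_batch(batch_df, batch_id):
          target = DeltaTable.forName(batch_df.sparkSession, "main.default.customers")
          (target.alias("t")
                 .merge(batch_df.alias("s"), "t.id = s.id")
                 .whenMatchedUpdateAll()
                 .whenNotMatchedInsertAll()
                 .execute())

      # stream_df.writeStream.foreachBatch(upsert_batch).start()
      ```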

  • @majdi_saadani
    @majdi_saadani 7 months ago

    Hello Stephanie, thank you for the video; it is interesting to see how we can include Airflow with Databricks and manage jobs externally. My question: is there a benefit to using Airflow to schedule Databricks jobs instead of using Workflows directly in the Databricks UI?

    • @stephanieamrivera
      @stephanieamrivera 5 months ago

      Thanks for the question. Not really. I see customers use Workflows unless their company already uses Airflow.

  • @maksymbelko
    @maksymbelko 7 months ago

    Awesome, glad to find such useful content before migration.

  • @allthingsdata
    @allthingsdata 8 months ago

    Excellent session and material. Thanks a lot!

  • @user-qi6lh1vo9z
    @user-qi6lh1vo9z 8 months ago

    Can you provide all the connectivity of the VPC, such as which subnets are connected to which route tables and to which endpoints?

    • @stephanieamrivera
      @stephanieamrivera 5 months ago

      Will get back to you shortly on this!

    • @stephanieamrivera
      @stephanieamrivera 5 months ago

      The connectivity of the VPC will vary from deployment to deployment. In this case, there are private subnets with route tables pointing to an S3 gateway endpoint, the local VPC CIDR, and 0.0.0.0/0 to a NAT gateway. The public subnet then routes all traffic to an internet gateway. The traffic from the EC2 instance to the PrivateLink endpoint for Databricks is covered by the local VPC CIDR range route-table entry. The traffic finds its way to the endpoint using DNS resolution. Hope this helps!

    • @OPopoola
      @OPopoola 1 month ago

      @@stephanieamrivera It doesn't. The training is vague in some areas. What I would like to see is an explanation of: (1) a sample NACL for the traffic into the subnet; (2) a sample security group that can be attached to the cross-account IAM role that would work for PrivateLink purposes; (3) configuring access to S3, for example, and other IGW public services. I have been battling with this for 4 days now. I need a resolution ASAP. Documentation is taking me all over the place. There are members of my team calling for other platforms, but I am adamant that this is the best platform for our purposes. Thanks.

    • @stephaniedatabricksrivera
      @stephaniedatabricksrivera 19 days ago

      @@OPopoola Please reach out to your Databricks team for more details. NACLs remain standard; they should be unchanged. This is standard AWS networking: an S3 gateway endpoint for in-region buckets, and a NAT gateway to an internet gateway for public services, with a WAF if needed.
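
      One way to answer the original question for a specific deployment is to enumerate the subnet and route-table associations with boto3; the VPC ID is a placeholder:

      ```python
      import boto3

      ec2 = boto3.client("ec2")
      rts = ec2.describe_route_tables(
          Filters=[{"Name": "vpc-id", "Values": ["vpc-0123456789abcdef0"]}]
      )["RouteTables"]

      # Print route table -> subnets -> route targets (IGW, NAT, gateways, ...).
      for rt in rts:
          subnets = [a.get("SubnetId") for a in rt["Associations"]]
          for r in rt["Routes"]:
              target = r.get("GatewayId") or r.get("NatGatewayId") or r.get("TransitGatewayId")
              print(rt["RouteTableId"], subnets, r.get("DestinationCidrBlock"), "->", target)
      ```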

  • @TimFrazer-xe9dd
    @TimFrazer-xe9dd 8 months ago

    Thanks for sharing. When creating an external connection to Azure SQL, what authentication methods are supported? Can we use a Service Principal or is it limited to SQL Authentication?

    • @stephanieamrivera
      @stephanieamrivera 5 months ago

      There are 2 different authentication methods that can be used:
      1. SQL authentication: Provide a username and password to authenticate against the Azure SQL server. This requires a login and password configured on the Azure SQL server.
      2. Azure Active Directory (AD) authentication: Azure SQL supports using Azure AD identities to authenticate and authorize database access. This method enables you to use Azure AD accounts or groups to authenticate and manage access to your Azure SQL Database or Azure Synapse Analytics.
      Azure SQL currently does not directly support authenticating with Service Principals. However, you can use the Azure AD credentials associated with a Service Principal by creating a SQL login mapped to the Service Principal's Azure AD identity.
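
      In Lakehouse Federation terms, the SQL-authentication option maps to a connection like the sketch below; the server, database, and credentials are placeholders (secrets would normally replace the literals):

      ```python
      spark.sql("""
          CREATE CONNECTION IF NOT EXISTS azure_sql_conn TYPE sqlserver
          OPTIONS (
              host '<server>.database.windows.net',
              port '1433',
              user '<sql-login>',
              password '<sql-password>'
          )
      """)
      spark.sql("""
          CREATE FOREIGN CATALOG IF NOT EXISTS azure_sql_cat
          USING CONNECTION azure_sql_conn
          OPTIONS (database '<database-name>')
      """)
      ```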

  • @vonmoraes
    @vonmoraes 8 months ago

    Is there a video of connecting to and using this in Dataflow? I mean a hands-on video, haha

    • @stephanieamrivera
      @stephanieamrivera 8 months ago

      You mean connecting Databricks to Dataflow?

  • @bolbol8043
    @bolbol8043 8 months ago

    Thanks for sharing. Can I send the logs from Databricks directly to CloudWatch/CloudTrail without an S3 bucket?

  • @jimthorstad6226
    @jimthorstad6226 8 months ago

    Please note - as of October 2023 the Unity Catalog metastore no longer requires an access connector and storage root path, greatly simplifying the setup! So you can disregard the steps and discussion at these points in the video: time=7:18, 8:28, and 33:40. The rest of the video is still super helpful to understand how to setup your Catalogs, Storage Credentials, External Locations, and more. Note that the steps to make an access connector (time=27:15) still apply to creating your dev/test/prod Catalog managed storage locations. Hope to create an updated video reflecting these simplifications sometime soon!

  • @user-lr3sm3xj8f
    @user-lr3sm3xj8f 8 months ago

    Thank you, Stephanie and Ashley. When using this at scale with a team, do we have a DAB folder per project (multiple databricks.yml files)? If so, what is the best way to structure the repo for GitHub Actions to init all of those bundles?

  • @niting123
    @niting123 9 months ago

    Great resource for UC migration. Thanks for sharing.

  • @Learn2Share786
    @Learn2Share786 9 months ago

    Can you share the slides?

  • @Learn2Share786
    @Learn2Share786 9 months ago

    Thanks. In the executor I see 2 blocks: 1) memory, 2) local disk. I understand the memory size is determined when selecting the worker type from the UI drop-down, but how is the "local disk" size determined?

    • @stephanieamrivera
      @stephanieamrivera 5 months ago

      The local disk size in the executor is determined by the instance type of the worker nodes in your Databricks cluster. The size of the local disk is a characteristic of the instance type and is fixed for each instance. Local disk is used for storing temporary data, intermediate results, and executor-related files during the processing of jobs and tasks.

  • @abisheksubramanian8069
    @abisheksubramanian8069 9 months ago

    Grand session 👍

  • @SanjaySingh-gj2kq
    @SanjaySingh-gj2kq 9 months ago

    Very informative presentation. Thanks for sharing the video and slides.

    • @stephanieamrivera
      @stephanieamrivera 8 months ago

      Our pleasure!

    • @theharshavinash
      @theharshavinash 6 months ago

      @@stephanieamrivera Hey, would it be possible to get the code too?

  • @deenadayalan927
    @deenadayalan927 9 months ago

    How can I join the live session?

  • @berkerkozan3659
    @berkerkozan3659 9 months ago

    Amazing, super to the point. It would be great to also see the control plane public access part!

    • @stephanieamrivera
      @stephanieamrivera 8 months ago

      Glad you enjoyed it! I will ask JD if he can respond :)