Azure Databricks is Easier Than You Think

  • Published: 16 Sep 2024

Comments • 27

  • @mugilkarthikeyan7131
    @mugilkarthikeyan7131 2 years ago +4

    I've never seen anyone explain Azure Databricks as well as you.

  • @vaibhavrana4953
    @vaibhavrana4953 3 years ago +2

    You have explained Spark & Azure Databricks very well. Thank you.

  • @pashersil
    @pashersil 2 years ago +1

    Wow, you mentioned the SSIS and ETL problems I totally relate to... you have earned cred with me.

  • @harishjulapalli448
    @harishjulapalli448 4 years ago +1

    Great Intro to Databricks and Spark. Thank You.

  • @rmravilla
    @rmravilla 2 years ago

    Thanks for the presentation. It is very useful if one wants to learn Spark & Azure Databricks.

  • @josecarlossilva3670
    @josecarlossilva3670 2 years ago +1

    Great content!! It really helped me a lot. Congrats!

  • @chan7354
    @chan7354 3 years ago

    The explanation is very good and it helped us understand the topic.

  • @goselvam
    @goselvam 1 year ago

    Thanks for the great video. Just wanted to let you know that the slide at 38:59 has the incorrect expansion for DAG, which is shown as Directed Acrylic Graph instead of Directed Acyclic Graph.

  • @svapneel1486
    @svapneel1486 4 years ago +1

    Great Video. Made it very easy to explain

  • @Pravinamadoori
    @Pravinamadoori 2 years ago

    The instructor has given a clear demo. Does he have any courses on Udemy?

  • @TheSQLPro
    @TheSQLPro 4 years ago +1

    Excellent video!

  • @gopinathrajee
    @gopinathrajee 2 years ago

    @ 29:50, when you say Azure, do you mean Azure PaaS? And by ExpressRoute, do you mean Microsoft Peering? How does a VNet get created on the PaaS, though? If it is a VNet, does it not fall under the corp network?

    • @Atmosera-
      @Atmosera-  2 years ago

      Azure PaaS can connect to a VNet using private endpoints. ExpressRoute enables on-premises connectivity back into the VNet.

  • @mangeshxjoshi
    @mangeshxjoshi 4 years ago +2

    Good video and explanation. One question: if an on-premises Informatica ETL tool needs to migrate to a cloud platform, is there an equivalent cloud tool that can replace Informatica, or can we use Informatica Cloud integration on the cloud platform (as a PaaS service)? How will traditional ETL tools like Informatica be replaced with cloud infrastructure? I believe Databricks is for processing power, but I believe we cannot do ETL transformation in Databricks. Please suggest.

    • @PhilipHoyos
      @PhilipHoyos 4 years ago

      You can use Databricks for ETL.

    • @anmoltrehan6060
      @anmoltrehan6060 2 years ago

      Azure Data Factory and Databricks will do the trick

    • @shiladityachakraborty9826
      @shiladityachakraborty9826 2 years ago

      We can use IICS and Databricks. Infa BDM will cater to the relevant ETL rules and Databricks will be there to visualise the data. That said, it's always better to use a tool's native functionality, so if there is no upskilling issue it's better to handle the ETL through PySpark in Databricks itself. This is cost-effective as well.
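The pattern these replies converge on, doing the transformations in Databricks itself, has a simple extract-transform-load shape. Below is a minimal sketch of that shape in plain Python so it runs anywhere; on a Databricks cluster the same steps would use `spark.read`, DataFrame transformations such as `filter` and `withColumn`, and `df.write`. The record and column names here are hypothetical.

```python
def transform(rows):
    """Apply simple ETL rules: reject incomplete records, standardise
    casing, and derive a total column. A plain-Python stand-in for
    PySpark's filter/withColumn calls."""
    cleaned = []
    for r in rows:
        if r.get("qty") is None or r.get("price") is None:
            continue  # reject records with missing required fields
        cleaned.append({
            "region": r["region"].strip().upper(),  # standardise casing
            "total": r["qty"] * r["price"],         # derived column
        })
    return cleaned

# Extract (here: an in-memory stand-in for spark.read.csv)
raw = [
    {"region": " west ", "qty": 3, "price": 2.5},
    {"region": "east", "qty": None, "price": 1.0},  # will be rejected
]

# Transform, then load (here: just print; on Databricks, df.write)
print(transform(raw))  # -> [{'region': 'WEST', 'total': 7.5}]
```

On a real cluster the same filter-and-derive logic maps naturally onto `df.filter(...)` and `df.withColumn(...)`, which is why the replies suggest keeping the ETL inside Databricks rather than round-tripping through another tool.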

  • @Cur8or88
    @Cur8or88 3 years ago +1

    Directed Acrylic Graphs are more durable: 39:08

  • @Shradha_tech
    @Shradha_tech 2 years ago

    Thank you so much for this video 😀

  • @Pravinamadoori
    @Pravinamadoori 2 years ago

    Can someone suggest a good book for automating or testing ETL on AWS S3 using Databricks?

  • @amjds1341
    @amjds1341 3 years ago

    Great

  • @denwo1982
    @denwo1982 3 years ago

    Hi, I'm coming from a SQL Database background and at the moment I am not seeing the benefit of using Azure Databricks. There is nothing stopping me using ADF to pick up a file from ADLS Gen2, put it into a staging table, and then creating a stored proc to do the transformation and insert that into a destination table. Or am I missing something here?

    • @Atmosera-
      @Atmosera-  3 years ago

      Databricks is a way of doing ETL on Azure.
      ADF can do a lot, but it's much more limited in its scope. Doing complex transformations in Databricks tends to be easier.
      But if you're using ADF, there's nothing wrong with that.

    • @denwo1982
      @denwo1982 3 years ago

      @@Atmosera- Thanks for your reply. Do you have any material regarding deltas? For example, checking a CSV file on ADLS Gen2 for new rows entered within the last hour based on the modified date? Or would it be a case of loading the file into a SQL staging table, comparing it to the destination table, finding the new rows, and then putting that data into another CSV file in an ADLS Gen2 folder? Which tool would be more efficient and cost-effective?

    • @Atmosera-
      @Atmosera-  3 years ago

      @@denwo1982 I would not use deltas; that is a costly comparison to do. The best thing to do is partition your data in ADLS into separate files that are timestamped, and load new records that way rather than use a single file.
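The timestamped-file approach suggested above can be sketched as follows. This is plain Python, and the naming convention (`sales_2024-09-16T10-00.csv`) is a hypothetical example; use whatever pattern your writer produces. On Databricks you would typically hand the selected paths, or an equivalent glob pattern, straight to `spark.read.csv`.

```python
from datetime import datetime

def new_partitions(paths, last_loaded):
    """Pick only the files whose embedded timestamp is after the last
    successful load, avoiding any row-by-row comparison against the
    destination table. Assumes names like 'sales_2024-09-16T10-00.csv'
    (hypothetical convention)."""
    picked = []
    for path in paths:
        # isolate the timestamp between the last '_' and the extension
        stamp = path.rsplit("_", 1)[1].rsplit(".", 1)[0]
        if datetime.strptime(stamp, "%Y-%m-%dT%H-%M") > last_loaded:
            picked.append(path)
    return picked

files = ["sales_2024-09-16T09-00.csv", "sales_2024-09-16T10-00.csv"]
print(new_partitions(files, datetime(2024, 9, 16, 9, 30)))
# -> ['sales_2024-09-16T10-00.csv']
```

Because the selection is driven purely by file names, each run only has to remember the timestamp of its last successful load instead of comparing rows against the destination table.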