Get Data Into Databricks - Simple ETL Pipeline

  • Published: 23 Jan 2025
  • Science

Comments • 23

  • @nicky_rads
    @nicky_rads 2 years ago +14

    Solid demo for an intro to data engineering!

  • @sumantra_sarkar
    @sumantra_sarkar 10 months ago +2

    Thanks for the demo. Do you all have a link to the slide deck and the data set please?

  • @LearnWithDummy
    @LearnWithDummy 8 months ago

    Strange: step 1 says "Bronze: loading data from blob storage", but the path is an S3 path? Am I missing something here?

  • @rabish86
    @rabish86 2 years ago +11

    Can you provide the data file or source shown in this video so we can practice?

  • @ongbak6500
    @ongbak6500 2 years ago +8

    Hi, where can I get the code you are showing here?

  • @julius8183
    @julius8183 9 months ago

    Very clear and quick tutorial. Well done, thanks!

  • @esteban-alvino
    @esteban-alvino 6 months ago +1

    Hello, thanks for the video. I couldn't follow along because of the Jupyter notebook. What do you recommend I use in order to replicate what you did in this video? Thank you.

  • @TheDataArchitect
    @TheDataArchitect 1 year ago

    You haven't appended any metadata to the bronze layer, such as when each record was ingested and which file it came from.
    The bronze layer should hold all historical data, no?
    And what should be done next at the silver layer, so that only unprocessed data is promoted to the silver table?

  • @UntouchedPerspectives
    @UntouchedPerspectives 1 year ago

    What about on-prem data and IoT data? Does DBX have ingestion capabilities?

  • @mikevladi
    @mikevladi 2 months ago

    sha1 creates a hash (the "secure" in its name is a misnomer; it is just longer than an MD5 hash), which should not be considered a replacement for encryption. Python supports the HMAC algorithm through its hmac module, which lets you mix in a secret key for keyed, tamper-resistant hashing.
    Otherwise the presentation is thorough. Thank you!
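The sha1-versus-HMAC point can be illustrated with a short stdlib-only sketch; the message and key below are made up:

```python
import hashlib
import hmac

message = b"email@example.com"

# Plain SHA-1: anyone who knows the algorithm can recompute this digest,
# so it only obscures the value, it does not protect it.
plain_digest = hashlib.sha1(message).hexdigest()

# HMAC-SHA256: mixing in a secret key means the digest can only be
# recomputed (or verified) by someone who holds the key.
secret_key = b"a-secret-key-from-a-vault"  # illustrative; keep real keys in a secret manager
keyed_digest = hmac.new(secret_key, message, hashlib.sha256).hexdigest()

# Use a constant-time comparison when verifying, to avoid timing attacks.
expected = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
print(hmac.compare_digest(keyed_digest, expected))  # True
```

Note that neither digest is reversible; HMAC adds key-dependent integrity, not encryption, which matches the comment's caveat.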

  • @dhruvpathi941
    @dhruvpathi941 1 year ago +3

    Where can I find this notebook?

  • @madhusudhan4484
    @madhusudhan4484 3 months ago

    Thanks for the demo

  • @valentinloghin4004
    @valentinloghin4004 1 month ago

    Super nice! Can you provide the datasets?

  • @omer_f_ist
    @omer_f_ist 2 years ago

    In the video, the orders/spend data is exported as CSV files. Should source OLTP systems export data this way? Is it more practical than the other methods (JDBC, etc.)?

  • @bharatia4826
    @bharatia4826 4 months ago

    Informative, great demo! Many thanks.

  • @DurandKwok
    @DurandKwok 1 year ago +1

    Nice. Is the notebook available to download and try?

  • @rendorHaevyn
    @rendorHaevyn 2 years ago +1

    Great demo

  • @thanhphamac5433
    @thanhphamac5433 1 month ago

    Really helpful

  • @7effrey
    @7effrey 2 years ago +1

    Is this the recommended way of doing ETL with Databricks? I thought Delta Live Tables were the recommended approach now.

    • @uditranjan2432
      @uditranjan2432 2 years ago +5

      This is one of the ways to build a simple pipeline with Databricks - how one can easily get data from cloud storage and apply some transformations on it. Delta Live Tables (DLT) is the recommended approach for modern ETL/more complex workflows. We will publish an explainer video on DLT soon.
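For readers wondering what the DLT version of this pipeline looks like, here is a minimal sketch. It only runs inside a Databricks Delta Live Tables pipeline, where the `dlt` module and `spark` session are provided by the runtime; the storage path, table names, and data-quality rule are illustrative assumptions:

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw orders loaded incrementally from cloud storage")
def bronze_orders():
    return (
        spark.readStream.format("cloudFiles")        # Auto Loader
        .option("cloudFiles.format", "csv")
        .load("s3://example-bucket/orders/")         # illustrative path
    )

@dlt.table(comment="Cleaned orders")
@dlt.expect_or_drop("valid_amount", "amount > 0")    # drop rows failing this rule
def silver_orders():
    return dlt.read_stream("bronze_orders").withColumn(
        "amount", F.col("amount").cast("double")
    )
```

Compared with the notebook pipeline in the video, DLT declares the tables and their dependencies and lets the runtime manage orchestration, retries, and incremental processing.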

  • @vaddadisanthoshkumar4143
    @vaddadisanthoshkumar4143 2 years ago

    Thank you. 🙏

  • @borrarao1525
    @borrarao1525 1 year ago

    Good

  • @peterko8871
    @peterko8871 10 months ago

    So what is the challenge here? This looks like something a 12-year-old could set up: basically just organizing some tasks in sequential order.