Advancing Spark - Getting hands-on with Delta Cloning

  • Published: 7 Sep 2024
  • Last week we looked at the announcements for Databricks Runtime 7.2 and got all excited about the notes for Delta Cloning - but we had some really good questions raised about exactly what happens under the hood. So this week join Simon as he takes a bit of a dive into DEEP and SHALLOW cloning with Delta on Databricks.
    For more info on the Clone functionality and the other syntax available, take a look at the notes here: docs.databrick...
    As always, for more tasty blogs, or info about our hands-on training courses, come visit us at: www.advancinga...
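
For quick reference, the clone syntax discussed in the video looks roughly like this - a minimal sketch run from a Databricks notebook, using hypothetical table names (`sales_source`, `sales_clone_shallow`, `sales_clone_deep`):

```python
# Minimal sketch of the two clone flavours covered in the video.
# Table names are hypothetical - substitute your own schema/table names.

# SHALLOW CLONE: copies only the Delta metadata; the new table points at the
# source table's existing data files, so it completes in seconds.
spark.sql("""
    CREATE OR REPLACE TABLE sales_clone_shallow
    SHALLOW CLONE sales_source
""")

# DEEP CLONE (the default): copies the data files as well as the metadata,
# so the new table is fully independent of the source.
spark.sql("""
    CREATE OR REPLACE TABLE sales_clone_deep
    DEEP CLONE sales_source
""")
```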

Comments • 13

  • @lucian1511
    @lucian1511 1 year ago +1

    Very nice video! Keep up the good work!
    I am in the process of learning and your videos are excellent. I can only hope you will continue to upload new interesting stuff.
    Thank you!

  • @tanushreenagar3116
    @tanushreenagar3116 11 months ago

    nice sir

  • @the.activist.nightingale
    @the.activist.nightingale 4 years ago +1

    Nice video yet again, Simon!
    I really appreciate how you take the time to show all the manipulations and even the bugs ;)
    Seems like a cool feature, but I'm wondering how it would fare if I am cloning a huge table of 70-140M rows? Maybe some stress-testing would be needed on my side :)
    On the light side, please don't zoom on your face too often I get mesmerized by your eyes (are they blue-green) and I need to replay the parts multiple times :D HAHAHAHA#GirlProblems

    • @AdvancingAnalytics
      @AdvancingAnalytics  4 years ago

      Hey! Glad the videos are still useful!
      So the shallow clone for ~140m rows will take a couple of seconds, as it's just a bit of metadata. The deep clone will depend on your cluster, but that's not a huge amount of data for Spark - you could easily have it cloned in 5-20 minutes depending on the size of the cluster!
      Simon
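
One rough way to see that difference for yourself is simply to time the two statements - a sketch reusing the hypothetical `sales_source` table from the description above:

```python
import time

# Hypothetical table names. The shallow clone should return in seconds because it
# only writes metadata; the deep clone's duration depends on data volume and cluster size.
for mode in ("SHALLOW", "DEEP"):
    start = time.time()
    spark.sql(f"CREATE OR REPLACE TABLE sales_timed_{mode.lower()} {mode} CLONE sales_source")
    print(f"{mode} clone took {time.time() - start:.1f}s")
```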

  • @nikkaz5639
    @nikkaz5639 1 year ago

    Hey Simon, thanks a lot for this video. A question: how would you then make the cloned version live, so that it becomes the original one? Thanks

    • @AdvancingAnalytics
      @AdvancingAnalytics  1 year ago

      Hrm, not sure that's possible - unless you update all files within the delta table, it will still be pointing to some files from the original! I'd say to treat clones like temporary entities, then re-do the operation if you want to make it permanent?
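
Simon's advice is to treat clones as temporary. If you did want to materialise a clone's current state back over the original, one possible sketch - not shown in the video, with hypothetical table names, and it rewrites every file, in line with the "update all files" caveat above - is a deep clone in the other direction:

```python
# Hypothetical names: 'orders' is the production table, 'orders_experiment' is a
# clone that has been modified. Deep cloning it back over the original replaces
# the original's contents with fully copied, independent data files.
spark.sql("""
    CREATE OR REPLACE TABLE orders
    DEEP CLONE orders_experiment
""")
```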

  • @bhaveshpatelaus
    @bhaveshpatelaus 4 years ago

    Thanks Simon. I can see the use case for this in a DR scenario where the primary and secondary regions in ADLS or Blob Storage are doing an asynchronous copy of the data and can thus leave Delta tables corrupted! Does DEEP CLONE happen with ACID guarantees? What if you are CLONING big tables and there is an interruption to the cloning operation - does it land incomplete data?

  • @prashanthxavierchinnappa9457
    @prashanthxavierchinnappa9457 3 years ago

    Hey Simon, thanks for a great video. Just the kind of channel I was looking for. A quick question: I'm wondering what the best way is to copy only certain partitions of a Delta table and create a new Delta table without having to copy all the contents. I assumed cloning would help somehow, but that does not seem to be the case.

    • @AdvancingAnalytics
      @AdvancingAnalytics  3 years ago +1

      Afraid cloning doesn't support partition-scoping that I know of. You would likely need to write a quick dataframe that reads your source, filters to your desired partitions and writes to the new table - you wouldn't get table settings, transaction history etc copied across though! There are some workarounds with cloning, deleting partitions etc, but it'll be more work than just writing a quick dataframe!
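
A sketch of that "quick dataframe" approach, assuming a hypothetical source table `events` partitioned by an `event_date` column:

```python
# Copy only the partitions you need into a new Delta table.
# Note: table properties, constraints and transaction history are NOT carried over.
(
    spark.read.table("events")
        .where("event_date >= '2024-01-01'")      # keep only the desired partitions
        .write.format("delta")
        .mode("overwrite")
        .saveAsTable("events_subset")
)
```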

  • @sid0000009
    @sid0000009 4 years ago

    Shallow Clone: what happens to the cloned table if we update the original table? As we understand it, the cloned table initially points to the original table's data files. Thanks

    • @AdvancingAnalytics
      @AdvancingAnalytics  4 years ago +2

      So the original table will see the new files as "replaced" in the transaction log. The cloned table will point at the old files and work as expected. The only problem will come if you run a VACUUM on the original table after updating - then the shallow clone will no longer function. So not great for the long term, but fantastic for short-term testing/experimentation!
      Simon
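
A sketch of the sequence described above, with hypothetical table and column names - note that vacuuming with a zero-hour retention requires disabling the retention check and is only sensible for a throwaway demo:

```python
# 1. Shallow clone: points at the original table's current data files.
spark.sql("CREATE OR REPLACE TABLE sales_shallow SHALLOW CLONE sales_source")

# 2. Update the original (hypothetical 'amount' column). Delta writes new files and
#    marks the old ones as removed in the transaction log; the shallow clone keeps
#    reading the old files and still works.
spark.sql("UPDATE sales_source SET amount = amount * 1.1")
spark.sql("SELECT COUNT(*) FROM sales_shallow").show()   # still returns results

# 3. Vacuum the original aggressively (demo only!) - this physically deletes the
#    old data files that the shallow clone depends on.
spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", "false")
spark.sql("VACUUM sales_source RETAIN 0 HOURS")

# 4. The shallow clone now fails with missing-file errors when queried.
spark.sql("SELECT COUNT(*) FROM sales_shallow").show()
```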

    • @sid0000009
      @sid0000009 4 years ago

      @AdvancingAnalytics you're a genius!

    • @nishu2u85
      @nishu2u85 2 years ago

      Thanks much for clarifying :)