The future of Delta Lake and Apache Iceberg with Tathagata Das
- Published: 5 Feb 2025
- Software Engineer: Tathagata Das is a Staff Software Engineer at Databricks, an Apache Spark committer, and a member of the Project Management Committee (PMC) for Apache Spark.
- Apache Spark: He is the lead developer behind Spark Streaming and has contributed significantly to the development of Structured Streaming.
- Delta Lake: Das is a core developer of Delta Lake and a committer to the project.
- Research: He has conducted research on data-center processing frameworks and networks at the University of California, Berkeley, and has published several papers on these topics.
- Author: Das is a co-author of "Learning Spark: Lightning-Fast Data Analytics" (2nd edition).
NextGenLakehouse Newsletter
#Databricks #DeltaLake #Delta #UnityCatalog #ETL #DataEngineering
Really insightful discussion. Thank you for that. Honestly, I've always wondered whether lakehouses built on open table formats can guarantee the same performance as MPP warehouses. The biggest reason for that concern is that in Delta every operation (insert, update, or delete) is essentially an insert of new files under the hood. Then there are other considerations like the small-file problem and optimized writes. I've always felt there was significant development/operational overhead in running OPTIMIZE and Z-ORDER, and now enabling DELETION VECTORS, just to keep tables performant as they grow. Does LIQUID CLUSTERING take that overhead away from customers and make their lives easier? I know Databricks promises intelligent optimization and automatic clustering for managed tables, but what about external tables? Most companies have external tables where the underlying files remain under their own control.
Yes, Liquid Clustering is a good starting point and moves things in the right direction in terms of user/developer experience. It may not solve every problem, but it already helps with the small-file part of your concern.
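For readers who haven't run these commands, here is a minimal sketch (not from the talk) contrasting the two approaches the thread describes: the manual OPTIMIZE / Z-ORDER / deletion-vector routine versus declaring clustering keys once with Liquid Clustering. The table and column names are invented; the syntax follows the public Delta Lake / Databricks documentation.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # on Databricks, `spark` already exists

# --- Manual maintenance path: enable deletion vectors, then periodically
# --- compact and Z-ORDER the table (hypothetical table sales.events).
spark.sql("""
    ALTER TABLE sales.events
    SET TBLPROPERTIES ('delta.enableDeletionVectors' = 'true')
""")
spark.sql("OPTIMIZE sales.events ZORDER BY (event_date, customer_id)")

# --- Liquid Clustering path: declare clustering keys on the table itself,
# --- instead of re-specifying Z-ORDER columns on every maintenance run.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales.events_lc (
        event_date  DATE,
        customer_id BIGINT,
        payload     STRING
    )
    USING DELTA
    CLUSTER BY (event_date, customer_id)
""")
# OPTIMIZE still compacts small files and applies the clustering, but the
# keys can later be changed with ALTER TABLE ... CLUSTER BY.
spark.sql("OPTIMIZE sales.events_lc")
```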
Why did Apple switch to Apache Iceberg?