Videos: 185
Views: 114,174

The Future of Apache Iceberg | Experts Panel in London

Moderated by Santona Tuli, Ph.D., this conversation dives deep into the trends, challenges, and innovations shaping the next phase of Iceberg’s evolution. From metadata management to real-world use cases, this panel covers it all, offering a unique blend of technical expertise and forward-looking insights.
This panel was recorded live at the Chill Data Summit in London. Our panelists included:
🔵 Ryan Dolley, Vice President of Product Strategy at GoodData
🔵 Chris Tabb, Co-Founder & CCO at Leit Data
🔵 Hugo Lu, Founder at Orchestra
🔵 Yoni Eini, CTO & Co-Founder at Upsolver
🔵 JB Onofré, Board Member of the Apache Software Foundation & Principal Software Engineer at Dremio
Whether you're an engineer, data architect...

Videos

The Importance of Data Modeling in Decision Making & System Design | Presentation by Seda Kocak

22:02

224 views, 3 months ago

In this session from the Chill Data Summit in London, Seda Kocak, Senior Data Analyst at The Dot Collective, explores the importance of data modeling when it comes to making decisions and designing systems. Seda explains how effective data modeling lays the foundation for sound decision-making and robust system architecture, helping organizations unlock the full potential of their data. Through rea...

Change Data Capture to Apache Iceberg | Presentation by Santona Tuli Ph.D.

38:31

665 views, 3 months ago

In this engaging presentation from the Chill Data Summit in London, Santona Tuli tackles the topic of Change Data Capture (CDC) in Apache Iceberg. CDC plays a crucial role in real-time data processing, enabling data updates and accurate analytics in dynamic environments. Santona walks us through how Apache Iceberg supports CDC to handle data changes efficiently and reliably, covering key use ca...

Analytical Data Transformations with Apache Iceberg Materialized Views | Presentation by Jan Kaul

25:52

224 views, 3 months ago

Watch this presentation from Jan Kaul, Founder and CEO at Dashbook, as he takes a deep dive into analytical data transformations with Iceberg materialized views in this informative session recorded live at the Chill Data Summit in London. In this presentation, Jan explores how Apache Iceberg leverages materialized views to simplify and optimize data transformations, providing a seamless experie...

The Future of Apache Iceberg & Navigating it with Polaris | Presentation by JB Onofré

34:19

467 views, 3 months ago

Watch this keynote from JB Onofré, Principal Software Engineer at Dremio and Member of the Board of Directors of The Apache Software Foundation. JB opened our Chill Data Summit event in London with his presentation on the future of Apache Iceberg and Apache Polaris (Incubating). In this talk, JB explains the components of the data lakehouse, exploring how the query engine, catalog, table format...

What Does the Future Hold for Apache Iceberg? | Experts Panel in San Francisco

53:01

114 views, 3 months ago

Enjoy the panel discussion from the Chill Data Summit in San Francisco, where leading minds in the data and open-source space debate the future of Apache Iceberg, a cutting-edge solution for managing large-scale data lakes. Moderated by Santona Tuli, Ph.D., this conversation dives into the trends, challenges, and innovations shaping the next phase of Iceberg’s evolution. From metadata managemen...

Apache Iceberg and Building in the Open | Presentation by Holden Karau

23:47

129 views, 3 months ago

In this engaging and honest talk, @HoldenKarau, an open source engineer, speaker, author, and Apache Spark committer, shares her wealth of experience working on open source software (OSS) projects. With a background in contributing to and maintaining some of the most widely used open-source frameworks, Holden offers a unique perspective on the realities of working in the OSS community. In this se...

REST Catalogs in Apache Iceberg | Presentation by Lisa Cao

21:20

544 views, 3 months ago

In this insightful talk, Lisa N. Cao, a leading contributor to the data community, explores the emerging role of REST catalogs in simplifying data management in an Apache Iceberg lakehouse. With her extensive experience in building scalable data solutions, Lisa delves into how REST-based catalogs offer a flexible and accessible way to interact with metadata, making it easier to manage large-sca...

Apache Iceberg, Arrow, Substrait, and the Inescapable Power of Open | Presentation by Jacques Nadeau

24:56

752 views, 3 months ago

In this presentation, Jacques Nadeau, co-creator of Apache Arrow and a visionary in the open-source data ecosystem, dives deep into the potential of open technologies. Jacques explores how projects like Apache Iceberg, Arrow, and Substrait are reshaping the future of data processing and analytics, starting with a look at Databricks' acquisition of Tabular and where that leads now. With his extens...

Apache Iceberg and the Deconstructed Database | Keynote by Julien Le Dem

23:51

2.5K views, 3 months ago

Watch this keynote from Julien Le Dem, a leading voice in the open-source community, for an insightful view of Apache Iceberg and the concept of the deconstructed database. Julien, known for his contributions to projects like Apache Parquet and his pioneering work in data architecture, explores how Iceberg is transforming data lakes by offering schema evolution, partitioning, and efficient data...

4:50

Change Data Capture (CDC) to Snowflake

154 views, 4 months ago

0:15

Ready to learn Apache Iceberg?

189 views, 5 months ago

Chill Data Summit on tour is here! Join us in San Francisco, London, New York or Tel Aviv.

Part 13 - Data quality validation - Hive to Iceberg Tables Migration eLearning Module

1:43

73 views, 7 months ago

🛠️ Considering an Iceberg lakehouse? Book a free consultation with an expert here: www.upsolver.com/discover
🎓 Watch additional Iceberg eLearning modules here: www.upsolver.com/resources/iceberg-academy
Description: In this video, we explore the benefits of using Iceberg, including straightforward SQL query comparisons for source and target data, monitoring schema evolution over time, and identif...

Part 12 - Post migration considerations - Hive to Iceberg Tables Migration eLearning Module

3:21

45 views, 7 months ago

Description: After migrating from Hive to Iceberg, it's crucial to monitor key metrics to ensure smooth operations. Check the ingestion rate for any drops or conflicts, as Iceberg's snapshot-orien...

Part 11 - Choosing the ideal strategy for you - Hive to Iceberg Tables Migration eLearning Module

1:37

56 views, 7 months ago

Description: In this video, we summarize the various strategies for migrating from Apache Hive to Apache Iceberg, breaking down the options into manageable steps. From in-place snapshot migration,...

Part 10 - Selective migration - Hive to Iceberg Tables Migration eLearning Module

1:32

51 views, 7 months ago

Part 9 - Mirror migration - Hive to Iceberg Tables Migration eLearning Module

4:57

92 views, 7 months ago

Part 8 - Duplicate migration - Hive to Iceberg Tables Migration eLearning Module

3:14

84 views, 7 months ago

Part 7 - In Place (metadata only) migration - Hive to Iceberg Tables Migration eLearning Module

5:46

142 views, 7 months ago

Part 6 - Migration strategies - Hive to Iceberg Tables Migration eLearning Module

2:06

76 views, 7 months ago

Part 5 - Migration considerations - Hive to Iceberg Tables Migration eLearning Module

4:57

76 views, 7 months ago

Part 4 - The Iceberg difference - Hive to Iceberg Tables Migration eLearning Module

5:35

123 views, 7 months ago

Part 3 - Challenges with Hive based data lakes - Hive to Iceberg Tables Migration eLearning Module

3:40

115 views, 7 months ago

Part 2 - Why migrate to Iceberg - Hive to Iceberg Tables Migration eLearning Module

5:13

309 views, 7 months ago

Part 1 - Intro - Hive to Iceberg Tables Migration eLearning Module

2:19

158 views, 7 months ago

Part 14 - Testing - Hive to Iceberg Tables Migration eLearning Module

2:52

38 views, 7 months ago

Part 9 - Iceberg Table Services - Building Iceberg Lakehouse With Spark - eLearning Module

8:00

78 views, 8 months ago

Part 8 - Optimistic Concurrency - Building Iceberg Lakehouse With Spark - eLearning Module

8:55

95 views, 8 months ago

Part 7 - Deleting Rows - Building Iceberg Lakehouse With Spark and Upsolver - eLearning Module

4:07

105 views, 8 months ago

Part 6 - Understanding How Manifests Work (Create/Insert) - Building Iceberg Lakehouse With Spark

6:39

76 views, 8 months ago

@thomasgremm6127 13 days ago
Great!
@alexmiro4161 13 days ago
I find this playlist so good, useful and concise! Can't believe there are so few views and likes.
@rorycawley 1 month ago
Could you post the slides?
@RuairiODonnellFOTO 1 month ago
So which REST catalog is the best choice for an enterprise?
@Studiful 4 months ago
There are a whole lot of assumptions made here that, in my experience, just are not true. E.g., processing is almost always much more expensive than storage, so there are no cost savings in saving space at the expense of processing. Also, the warehouse is not a business model at all. It seems this is from the very narrow perspective of someone who focuses mainly on data science rather than on data engineering. There is huge value in having a centralised location for data where the definitions are the same, the data is enriched from many sources into a cohesive and unified view, and there is much less duplication of data (so the marginal cost savings of space in this presentation are nullified anyway). The main reason that this approach is popular at the moment is that people don't have to wait so long to access the data and they can run off and do their own empire building in a silo. This is great for the individual, but not great at all for the business as a whole. Data scientists are used to mass duplication of data and massive processing costs in building their models, so this is normal for them. There is big trouble ahead for those who are designing everything around the very specific needs of machine learning. That is only a small part of the whole.
@AndreSilva-oy5kv 4 months ago
I have a PySpark Glue job in AWS that is in charge of compacting my Iceberg table; it uses the Iceberg procedure to compact the table. The job has been running for more than 4 hours and I keep getting timeouts. Is there an efficient way to compact a table without Spark?
@jeanchindeko5477 6 months ago
This conversation is totally biased because you guys are assuming Iceberg is the de facto standard. This is a narrative the Iceberg community is pushing in total disregard for what is happening:
- AWS support for Delta and Hudi was released in Redshift Spectrum before support for Iceberg. When Iceberg support was added last year, it was read-only.
- Snowflake supports Iceberg in read/write and Delta in read-only.
- GCP BigQuery supports both Iceberg and Delta.
- Oracle supports Delta and Delta Sharing.
- Salesforce supports not only Iceberg but also Delta, and has Zero ETL integration with the Databricks platform.
- Microsoft Fabric is built on top of Delta.
- OneHouse, which contributes a lot to Hudi, released Apache XTable to handle interoperability between the 3 table formats, and they’re working with Databricks on the Delta UniForm capability.
- And the list goes on…
Databricks has a more open track record than Snowflake, which only recently started to open up to the OSS community as a desperate move to stop customers moving to Databricks. Delta, with the Delta Kernel, makes it easier to integrate with the Delta spec and to support Delta capabilities in a more uniform way. In Iceberg, one query engine might support version 1 and some capabilities of version 2, while another only partially supports version 2, which brings its own set of challenges for customers!
Iceberg won on the community side against the 2 other formats and has used that to gain more momentum. But feature-wise, there is more innovation happening on the 2 other formats than on Iceberg. Iceberg has some great features, but is that, plus having a bigger community, enough to claim it’s the de facto standard? Not sure about that. It might be time for the Iceberg community to stop playing politics and work with the 2 other communities to end this table format war, which would be the best way forward to total interoperability.
Do you guys remember the time when different cellular operators were using different technologies, and people had different phones to be able to contact friends, family and colleagues? Then, when GSM was adopted, how easy it became to move between operators, or from one state to another, or to travel abroad? Now nobody thinks about it and we’re all using GSM worldwide. Why can't we have the same in data engineering?
@williamchurch711 7 months ago
Newbie here! Would a catalog be equivalent to the database schema?
@upsolver8921 6 months ago
Great question! A catalog is a wrapper around the Iceberg REST API which allows us to make commits to an Iceberg lakehouse. Iceberg comes with a REST catalog, and there are proprietary extensions to this catalog that can enrich the lakehouse experience with additional features such as data tag management, data governance through RBAC etc. And yes, part of the catalog's job is to expose tables in the lakehouse, along with their schema and statistics, to users.
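To make the reply above concrete, here is a toy, in-memory sketch of the two jobs it describes: exposing tables with their schemas, and accepting commits. It is not the Iceberg REST API. `ToyCatalog`, `TableMetadata`, `CommitConflict`, and the version check are hypothetical names and simplifications chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TableMetadata:
    """Hypothetical, simplified stand-in for a table's current metadata."""
    version: int
    schema: dict  # column name -> type name


class CommitConflict(Exception):
    """Raised when a commit is based on a stale metadata version."""


@dataclass
class ToyCatalog:
    """Maps table identifiers to their current metadata (illustration only)."""
    _tables: dict = field(default_factory=dict)

    def create_table(self, name: str, schema: dict) -> TableMetadata:
        if name in self._tables:
            raise ValueError(f"table already exists: {name}")
        meta = TableMetadata(version=1, schema=dict(schema))
        self._tables[name] = meta
        return meta

    def list_tables(self) -> list:
        # Exposing the tables in the lakehouse is part of the catalog's job.
        return sorted(self._tables)

    def load_table(self, name: str) -> TableMetadata:
        return self._tables[name]

    def commit(self, name: str, base: TableMetadata, new_schema: dict) -> TableMetadata:
        # Accept the commit only if the writer started from the current version.
        current = self._tables[name]
        if current.version != base.version:
            raise CommitConflict(f"{name}: expected v{base.version}, found v{current.version}")
        updated = TableMetadata(version=current.version + 1, schema=dict(new_schema))
        self._tables[name] = updated
        return updated


catalog = ToyCatalog()
v1 = catalog.create_table("sales.orders", {"id": "long"})
v2 = catalog.commit("sales.orders", v1, {"id": "long", "amount": "double"})
```

In this sketch a database schema corresponds to one table's `schema` field, while the catalog is the layer that tracks which tables exist and which metadata version is current for each.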
@tratkotratkov126 7 months ago
Other than the Iceberg open data format I was unable to notice anything else I could reference as “open” from the suggested solution architecture.
@ShawnGordon 7 months ago
Well, Iceberg is a spec, not a piece of software. It existed before Tabular and will exist after the acquisition. I'm really scratching my head about its long-term goal. I also wonder what Tabular customers have been told and if that product will continue to exist.
@guilhermecabral9391 8 months ago
pls bro fix your mic, btw good video
@antumukherjee5975 10 months ago
Could you please show the reverse process, i.e., from Athena to a Kafka cluster?
@upsolver8921 10 months ago
We do not currently support Kafka as a target for ingestion/transformation jobs; however, we are happy to consider it for a future release. Would you be open to discussing your use case with us?
@rachel-upsolver 11 months ago
Great to hear these amazing announcements from Ori and Yoni about Upsolver's new features to support Apache Iceberg, announced at the Chill Data Summit NYC. What a brilliant first event 🎉
@shivamkumarrai638 2 years ago
Should we create real-time streaming data solutions?
@irshviralvideo 2 years ago
How is this different from the Snowflake data warehouse and SQL?
@alassanesall8971 2 years ago
Hello, can we have the datasets?
@kavajvlogs1457 4 years ago
I was exploring this tool and it looks good. But there are not sufficient materials online. I think more quality videos are needed. A good demo with various examples is mandatory.

Upsolver

Comments