Tame the small files problem and optimize data layout for streaming ingestion to Iceberg

  • Published: 14 Mar 2023
  • In modern data architectures, stream processing engines such as Apache Flink are used to ingest continuous streams of data into data lakes such as Apache Iceberg. Streaming ingestion to Iceberg tables can suffer from two problems: the small files problem that can hurt read performance, and poor data clustering that can make file pruning less effective.
    In this session, we will discuss how data teams can address those problems by adding a shuffling stage to the Flink Iceberg streaming writer that intelligently groups data via bin packing or range partitioning, reduces the number of concurrent files that each task writes, and improves data clustering. We will explain the motivations in detail and dive into the design of the shuffling stage. We will also share evaluation results that demonstrate the effectiveness of smart shuffling.
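    To illustrate the bin-packing idea mentioned above, here is a minimal sketch (not the actual Flink/Iceberg implementation): a greedy assignment of partition keys to writer subtasks, so that each key's records route to a single writer (fewer concurrent files per task) while load stays balanced. The key names and traffic weights are hypothetical.

    ```python
    import heapq

    def bin_pack(key_weights, num_writers):
        """Greedy bin packing: assign each key to the currently least-loaded writer."""
        # Min-heap of (current_load, writer_index)
        heap = [(0, w) for w in range(num_writers)]
        heapq.heapify(heap)
        assignment = {}
        # Place the heaviest keys first for a tighter packing
        for key, weight in sorted(key_weights.items(), key=lambda kv: -kv[1]):
            load, writer = heapq.heappop(heap)
            assignment[key] = writer
            heapq.heappush(heap, (load + weight, writer))
        return assignment

    # Hypothetical per-key record rates observed upstream of the writers
    weights = {"US": 50, "EU": 30, "APAC": 15, "LATAM": 5}
    print(bin_pack(weights, 2))  # each key maps to exactly one of 2 writers
    ```

    In a real Flink job this assignment would drive a custom partitioner in the shuffle stage ahead of the Iceberg writer, so every partition key lands on exactly one writer subtask instead of being spread across all of them.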
