coalesce vs repartition vs partitionBy in spark | Interview question Explained

GK Codelabs

6K views

Published: Aug 25, 2021
Hi All,
In this video, I explain the concepts of coalesce, repartition, and partitionBy in Apache Spark.
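The three operations named above can be pictured with a small plain-Python model. This is only a sketch, not Spark code: partitions are modeled as lists of rows, and the helper names `repartition`, `coalesce`, and `partition_by` are illustrative stand-ins that mirror the Spark method names. It assumes the usual reading that `repartition` performs a full shuffle, `coalesce` merges existing partitions without splitting any of them, and `partitionBy` groups rows by the value of a column.

```python
from itertools import chain


def repartition(partitions, n):
    """Model of a full shuffle: every row may move; rows are dealt out round-robin."""
    rows = list(chain.from_iterable(partitions))
    return [rows[i::n] for i in range(n)]


def coalesce(partitions, n):
    """Model of a narrow merge: whole partitions are combined, none is split."""
    if n >= len(partitions):
        # Like Spark's coalesce, this model never increases the partition count.
        return [list(p) for p in partitions]
    merged = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        merged[i * n // len(partitions)].extend(part)
    return merged


def partition_by(rows, key):
    """Model of partitionBy: one group per distinct key value."""
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row)
    return groups


parts = [[1, 2, 3], [4], [5, 6], [7, 8, 9, 10]]  # four uneven partitions

print(repartition(parts, 2))  # [[1, 3, 5, 7, 9], [2, 4, 6, 8, 10]]  (evenly sized)
print(coalesce(parts, 2))     # [[1, 2, 3, 4], [5, 6, 7, 8, 9, 10]]  (sizes 4 and 6)
print(len(coalesce(parts, 8)))  # 4: asking for more partitions changes nothing

people = [("a", 20), ("b", 30), ("c", 20)]
print(partition_by(people, key=lambda r: r[1]))
# {20: [('a', 20), ('c', 20)], 30: [('b', 30)]}
```

In this model `repartition` evens out partition sizes at the cost of moving every row, while `coalesce` moves less data but can leave the result skewed.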
To become a GKCodelabs Extended plan member, check the links below and purchase the Big Data end-to-end pipeline course in your preferred language, Python or Scala.
PySpark course available at
courses.gkcodelabs.com/produc...
Spark + Scala course available at
courses.gkcodelabs.com/produc...
End-to-end pipeline introduction videos:
PySpark End to End Pipeline
• BIG DATA COMPLETE PROJ...
Spark + Scala End to End Pipeline
• BIG DATA complete PROJ...
Starter Pack available at just ₹549 (for Indian payments) or $9 (for non-Indian payments)
Extended Pack available at just ₹1299 (for Indian payments) or $19 (for non-Indian payments)
Queries? Write to us at support@gkcodelabs.com
Website: www.gkcodelabs.com
In this video I have shared my day-2 experience as a Big Data Engineer, including the usual tasks, assignments, calls, and routines in my life as a Big Data engineer.

Comments • 5

@johnsonrajendran6194 2 years ago
Nice explanation!!
@user-lp7sb5dw7l 7 months ago
When you do repartition() and then partitionBy(), the data is already partitioned by the partitionBy column, so why does the number of part files depend on repartition() again?
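One way to reason about the question above, assuming it refers to a write such as `df.repartition(n).write.partitionBy("age")`: each in-memory partition is written by its own task, and a task writes one part file into every output directory whose key value it holds. The plain-Python sketch below models only that counting rule. The helpers `round_robin`, `hash_by`, and `files_per_directory` and the sample rows are hypothetical, introduced purely for illustration.

```python
from collections import Counter


def round_robin(rows, n):
    """Model of repartition(n): rows are spread over n partitions regardless of key."""
    return [rows[i::n] for i in range(n)]


def hash_by(rows, n, key):
    """Model of repartitioning by a column: equal key values land in the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(key(row)) % n].append(row)
    return parts


def files_per_directory(partitions, key):
    """Count part files per key directory: one file per (partition, key value) pair."""
    files = Counter()
    for part in partitions:
        files.update({key(row) for row in part})  # each task writes one file per key it holds
    return dict(sorted(files.items()))


# Twelve sample rows spread over three age groups (made-up data).
rows = [(f"user{i}", (20, 30, 40)[i % 3]) for i in range(12)]
age = lambda r: r[1]

print(files_per_directory(round_robin(rows, 2), age))   # {20: 2, 30: 2, 40: 2}
print(files_per_directory(round_robin(rows, 4), age))   # {20: 4, 30: 4, 40: 4}
print(files_per_directory(hash_by(rows, 4, age), age))  # {20: 1, 30: 1, 40: 1}
```

In this model the number of directories is always three, one per age, while the number of files inside each directory follows the number of in-memory partitions that contain that age. Repartitioning by the same column before writing, which the last line imitates, brings it down to one file per directory.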
@srikanthk8261 2 years ago ⁺²
Good explanation. I have a question: as you mentioned, when you partition by the age column it creates 3 partitions because we have three age groups here. Let's assume I have 1000 unique IDs in a dataset and I partition by the Id column. How many partitions will it create, and on what basis? Could you please explain this briefly if you have time?
Thanks
Srikanth kita
@GKCodelabs 2 years ago ⁺²
Good catch 😊! I will try to answer this in as simple a way as possible, but it comes with some conditions 😉 (distributed computing always has a lot of givens and provideds) 😜
So for your case:
It will be 1000 partitions (condition: you have 1000+ cores on your cluster).
Else it will be equal to your number of cores (condition: each core can handle the amount of data it is processing).
Else it can be slightly more than your number of cores, in case some cores were not able to process the data given to them and processed the rest of it in the next cycle (task).
Hope I was able to answer your question!
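The conditions in the reply above can be made concrete with a little arithmetic. The sketch below is a plain-Python model under stated assumptions, not Spark code: it assumes the question is about `df.write.partitionBy("id")`, that the write produces one output directory per distinct value, that Spark runs one task per in-memory partition, and that each task occupies one core (the default `spark.task.cpus` of 1). The helper names `output_directories` and `task_waves` are hypothetical.

```python
import math


def output_directories(rows, key):
    """Directories created by a partitioned write: one per distinct key value."""
    return len({key(row) for row in rows})


def task_waves(num_tasks, total_cores, cpus_per_task=1):
    """Rounds of execution needed when only `total_cores // cpus_per_task` tasks run at once."""
    slots = total_cores // cpus_per_task
    if slots < 1:
        raise ValueError("not enough cores to run a single task")
    return math.ceil(num_tasks / slots)


# 5000 made-up rows carrying 1000 unique ids.
rows = [{"id": i % 1000, "value": i} for i in range(5000)]

print(output_directories(rows, key=lambda r: r["id"]))  # 1000, independent of the core count
print(task_waves(1000, total_cores=1000))  # 1: every task runs in the first round
print(task_waves(1000, total_cores=16))    # 63: leftover tasks wait for a later round
```

Under these assumptions the core count does not change how many directories are written. It changes how many tasks run at the same time, which matches the "next cycle" behaviour described in the reply.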
@MiRayalaseemaPillakai 2 years ago
1st view
