Shilpa DataInsights
  • 205 videos
  • 306,797 views
Most Asked PySpark Interview Questions: Mastering explode() Function
Looking to ace your PySpark interview at top consulting firms or data-driven companies? One of the most commonly asked concepts is the explode() function, a powerful tool for handling nested and array data structures.
In this video, we cover:
✅ What is the explode() function in PySpark?
✅ How explode() helps transform complex data types into a structured format.
✅ Real-world examples and scenarios where explode() is essential.
✅ How to answer interview questions related to explode() with confidence.
If you're preparing for a Big 4 consulting interview (Deloitte, PwC, EY, KPMG) or aiming for a data engineering or analytics role, this video will help you strengthen your PySpark skills.
Subscribe to...
49 views

Videos

Mastering Sort Functions in PySpark: Optimize Your Data Processing
541 views • 6 hours ago
In this video, I dive into the key sort functions in PySpark, demonstrating how to effectively order your data to enhance your Spark workflows. Understanding these sorting techniques is essential for improving performance and making data retrieval more efficient. In this video, we cover: ✅ What are Sort Functions in PySpark? ✅ Syntax and practical use cases ✅ Key differences between ascending a...
Big 4 Consulting Interview Questions (Part 4): Master Join, GroupBy, DateFormat & Aggregation
129 views • a day ago
Welcome to Part 4 of our Big 4 Consulting Interview Prep series! 🚀 In this episode, we dive deep into some of the most commonly asked PySpark functions in Big 4 consulting firm interviews, including join, groupBy, date formatting, and aggregation functions. What you’ll learn in this video: How to use join to merge datasets effectively and handle real-world data relationships. Mastering groupBy ...
Big 4 Deloitte Data Engineering Interview Questions Part 3: PySpark Functions Explained
368 views • 14 days ago
Big 4 Data Engineering Interview Questions Part 2: PySpark Functions Explained
124 views • 21 days ago
Big 4 Data Engineering Interview Questions Part 1: PySpark Functions Explained
109 views • 28 days ago
Master COLLECT_SET and COLLECT_LIST in Spark | Shilpa Data Insights
180 views • a month ago
Data Engineering Roadmap 2025 | Easiest Roadmap | 10X Your Salary | Become a Top 1% Data Engineer
1.1K views • a month ago
Master LAG & LEAD Functions in SQL Window Functions | Step-by-Step Guide using SQL and PySpark
138 views • a month ago
Null handling in PySpark DataFrame | isNull function in PySpark | isNotNull function in PySpark
160 views • a month ago
Master RANK, DENSE_RANK & ROW_NUMBER | Window Functions in SQL/PySpark Explained! | Databricks
210 views • a month ago
Master Aggregate Functions in SQL: MIN, MAX, AVG, and COUNT
89 views • a month ago
Coding Interview Question | Employees Earning Above Average Salary in Each Department | PySpark & SQL
158 views • 2 months ago
Databricks | PySpark | SQL Coding Interview: Employees Earning More than Managers
193 views • 2 months ago
Complete Roadmap to Become an Azure Data Engineer: Best Certification Roadmap 2025
943 views • 2 months ago
Databricks | PySpark | Coding Question: PySpark and Spark SQL | Scenario-Based Interview Question
275 views • 2 months ago
The Ultimate Python Roadmap 2025 (Before You Start): Fastest Way to Learn Python Programming
629 views • 2 months ago
Master PySpark Date & Time Functions | current_date, date_add, datediff, and More
207 views • 2 months ago
How to Handle Duplicate Data in PySpark: A Step-by-Step Guide to Clean Data Efficiently!
266 views • 3 months ago
Databricks | PySpark | Spark SQL: Except Columns in Select Clause
154 views • 3 months ago
split function in PySpark | PySpark advanced tutorial | getItem in PySpark | Databricks tutorial
139 views • 3 months ago
Databricks | PySpark | UDF to Check if Folder Exists
137 views • 4 months ago
Databricks Workshop: Auto Loader and COPY INTO, Step-by-Step Guide with Demo | PySpark
250 views • 4 months ago
Databricks Workshop: DLT Pipelines, Step-by-Step Guide to Creating a DLT Pipeline with Demo | PySpark
362 views • 4 months ago
Databricks Tutorial: Unity Catalog, Secret Scope, RBAC | Hands-On Training | Workshop | Demo
454 views • 4 months ago
Databricks Tutorial: Lakehouse Architecture, Delta Tables, Time Travel, Optimize | Hands-On Training
475 views • 5 months ago
Databricks: Working with DBFS, Magic Commands, and dbutils | Databricks Workshop Part 2
495 views • 5 months ago
Databricks for Beginners: Setup, Architecture, Notebooks, Libraries | PySpark Tutorial
802 views • 5 months ago
Databricks DLT Pipelines: A Step-by-Step Guide to Creating a DLT Pipeline with Demo | PySpark
647 views • 6 months ago
Databricks Ingestion Using COPY INTO | Databricks | PySpark | Incremental Data Load | COPY INTO
521 views • 6 months ago

Comments

  • @rahulkumarsharma7354
    @rahulkumarsharma7354 a day ago

    Hi Shilpa, nice explanation. But how would you differentiate between a DAG and a lineage graph? Many platforms say the DAG is the logical plan, while a few say it is the physical plan.

  • @rawat7203
    @rawat7203 2 days ago

    Hi Ma'am, could you also add the script to create the dataframe? Thank you.

    • @ShilpaDataInsights
      @ShilpaDataInsights 22 hours ago

      Please refer to this script for creating the dataframe:

          sample_data = [("Kolkata", "", "WB"),
                         ("", "Gurgaon", None),
                         (None, "", "banaglore")]
          columns = ["city1", "city2", "city3"]
          df_city = spark.createDataFrame(sample_data, schema=columns)
          df_city.display()

  • @rawat7203
    @rawat7203 3 days ago

    Hi Ma'am, can you also paste the data in the description so that we can practice?

    • @ShilpaDataInsights
      @ShilpaDataInsights 22 hours ago

      Hi, please use the script below for dataframe creation:

          from pyspark.sql import SparkSession
          from pyspark.sql.types import StructType, StructField, StringType

          data = [("01-06-2020", "Booked"), ("02-06-2020", "Booked"),
                  ("03-06-2020", "Booked"), ("04-06-2020", "Available"),
                  ("05-06-2020", "Available"), ("06-06-2020", "Available"),
                  ("07-06-2020", "Booked")]
          schema = StructType([
              StructField("show_date", StringType(), True),
              StructField("show_status", StringType(), True),
          ])
          spark = SparkSession.builder.appName("Solution").getOrCreate()
          df = spark.createDataFrame(data, schema)

  • @gudiatoka
    @gudiatoka 4 days ago

    Keep inspiring us 🎉

  • @rajruban637
    @rajruban637 7 days ago

    C++ is faster in execution than Python. Python can be written quickly, but its execution is slower than C++ 😊

  • @suravikalita6100
    @suravikalita6100 7 days ago

    This is not funny at all.

  • @lipsadas2236
    @lipsadas2236 11 days ago

    Very creative 🤩

  • @lipsadas2236
    @lipsadas2236 11 days ago

    😂😂

  • @Yaswanth
    @Yaswanth 18 days ago

    give df in comments

    • @ShilpaDataInsights
      @ShilpaDataInsights 22 hours ago

      Here is the code to create the dataframe:

          data = [(1, 'John', 'ADF'), (1, 'John', 'ADB'), (1, 'John', 'PowerBI'),
                  (2, 'Steve', 'ADF'), (2, 'Steve', 'SQL'), (2, 'Steve', 'Crystal Report'),
                  (3, 'James', 'ADF'), (3, 'James', 'SQL'), (3, 'James', 'SSIS'),
                  (4, 'Acey', 'SQL'), (4, 'Acey', 'SSIS'), (4, 'Acey', 'SSIS'), (4, 'Acey', 'ADF')]
          schema = ["EmpId", "EmpName", "Skill"]
          df1 = spark.createDataFrame(data, schema)

  • @NommyNommyNomNom
    @NommyNommyNomNom 19 days ago

    Thanks for these videos, Shilpa. I find them very helpful. I just started a new job as a data engineer after working as a Java web developer, and this is really useful for me.

    • @ShilpaDataInsights
      @ShilpaDataInsights 22 hours ago

      That's great, congratulations! Thank you, I am glad you found it helpful. If there is any topic you want me to cover, please let me know. 😊

  • @suryateja5323
    @suryateja5323 20 days ago

    Can Auto Loader support Delta tables? If any insert, update, or delete happens on a Delta table, can it trigger a load to some log table? If not, what is another way to log these operations, just like a SQL trigger? Enlighten me please 😢😢

  • @gudiatoka
    @gudiatoka 28 days ago

    Great, keep inspiring 🎉

  • @somerandomfatguy.3384
    @somerandomfatguy.3384 29 days ago

    Sweet and effective.

  • @kapilrana4043
    @kapilrana4043 a month ago

    Ma'am, I wanted to know whether data engineering is still worth learning in 2025, as most folks are saying there are no data engineering jobs for freshers in India or in European countries. Could you kindly shed some light on this issue?

    • @ShilpaDataInsights
      @ShilpaDataInsights 28 days ago

      Data engineering is relevant, and it will stay relevant as long as we are dealing with data.

  • @coolraviraj24
    @coolraviraj24 a month ago

    I am here to thank you, as I referred to your Databricks certification playlist and watched the question series, and I got to know a few new concepts. Scored 100% in the exam, so thank you, and happy new year ❤

    • @ShilpaDataInsights
      @ShilpaDataInsights 28 days ago

      Happy to hear that the Databricks certification playlist helped you, and hearty congratulations on clearing the exam with such an amazing score 😀!! Keep learning and growing!

  • @sam45326
    @sam45326 a month ago

    Thank you for this roadmap! This is very helpful.

  • @napoleanbonaparte9225
    @napoleanbonaparte9225 a month ago

    Ma'am, I am a regular follower of your videos; please add the data in the video for our practice. I am thankful for your useful content, which helped me understand very well, even after subscribing to several courses on Udemy.

    • @ShilpaDataInsights
      @ShilpaDataInsights 28 days ago

      Glad it helped you 😀. I have added the dataset with the complete code. Please find my GitHub link: github.com/shilpadata/pyspark_databricks/tree/b5dfa02ed45af980153fb258ba8d308f0654a128

  • @AllInOne-dn9cb
    @AllInOne-dn9cb a month ago

    Movie name

  • @napoleanbonaparte9225
    @napoleanbonaparte9225 a month ago

    I am thankful to you, ma'am, as you are a fantastic mentor in PySpark; no other YouTuber or Udemy tutor has this level of understanding. Kudos to you. But please upload the dataset you are working on. I tried to use a similar kind of dataset for previous videos, but by the time I try to understand, I get sidetracked, as the data in my case behaves differently.

    • @ShilpaDataInsights
      @ShilpaDataInsights 28 days ago

      The dataset with the code is present at the GitHub link: github.com/shilpadata/pyspark_databricks/blob/main/Window_function_Lag_Lead.py You can fork it from the above link.

  • @aldodemobr5353
    @aldodemobr5353 a month ago

    😂😂😂😂😂😂😂😂😂😂😂

  • @sam45326
    @sam45326 a month ago

    Great video! Thanks

  • @antoniospr14
    @antoniospr14 a month ago

    😂😂

  • @napoleanbonaparte9225
    @napoleanbonaparte9225 a month ago

    Where have you been all these tough days? Pretty lady, you are a gem of a person in PySpark, whom I searched for a lot on RUclips, and I joined 3 Udemy courses; I literally feel you are an angel who came to help us with PySpark.

    • @ShilpaDataInsights
      @ShilpaDataInsights 28 days ago

      The dataset with the code is present at the GitHub link: github.com/shilpadata/pyspark_databricks/blob/b5dfa02ed45af980153fb258ba8d308f0654a128/Date_time_Functions.py You can fork or download it from the above link.

  • @vineetasahu3735
    @vineetasahu3735 a month ago

    Relatability at its peak 📈

  • @ss-cw6he
    @ss-cw6he a month ago

    The tester is terrified 😂

  • @ambadibs4702
    @ambadibs4702 a month ago

    True 😂

  • @Sravanreddy143
    @Sravanreddy143 a month ago

    The way you explained it is very easily understandable and interesting. Thank you so much. 🎉 And please continue making more step-by-step playlists: PySpark, ADF, Databricks, SQL, and more.

    • @ShilpaDataInsights
      @ShilpaDataInsights 28 days ago

      Thank you. Sure, I will be coming up with more videos!!

  • @itsranjan2003
    @itsranjan2003 a month ago

    Nice explanation..

  • @Royal_jaat166
    @Royal_jaat166 a month ago

    Movie name

  • @igorbatista6902
    @igorbatista6902 a month ago

    Thank you for the very well explained video!

  • @deeiconic_world
    @deeiconic_world a month ago

    Very well explained 🙌🏻

  • @TejashKumar-sc5wi
    @TejashKumar-sc5wi a month ago

    So dangerous - JavaScript 😅❤😊

  • @deeiconic_world
    @deeiconic_world a month ago

    Very well explained👌🏻👌🏻

  • @Death_Note0_0
    @Death_Note0_0 a month ago

    😂😂

  • @VijayKumar-qr9hf
    @VijayKumar-qr9hf a month ago

    😂😂

  • @MukulMishra-ge3
    @MukulMishra-ge3 a month ago

    Nice save

  • @kanishksanger847
    @kanishksanger847 a month ago

    🤣🤣🤣🤣🤣🤣🤣

  • @sam45326
    @sam45326 a month ago

    :D

  • @prashantrana1591
    @prashantrana1591 2 months ago

    Really loving your creativity… Amazing work, keep it up!

    • @ShilpaDataInsights
      @ShilpaDataInsights 28 days ago

      Glad you liked it, keep an eye out for more creative content!

  • @R-v3x-q4f
    @R-v3x-q4f 2 months ago

    How long does the token last? When do I have to renew it?

  • @deepakrawat418
    @deepakrawat418 2 months ago

    Hi Ma'am, can you suggest a Python playlist for data engineers?

    • @ShilpaDataInsights
      @ShilpaDataInsights 2 months ago

      Please check out the roadmap: ruclips.net/video/9oVsSEUDePE/видео.html. I will be creating a Python playlist soon.

  • @sam45326
    @sam45326 2 months ago

    Thank you for this interview Q&A. Very helpful!

  • @sam45326
    @sam45326 2 months ago

    😂😂😂

  • @gauravsharma5568
    @gauravsharma5568 2 months ago

    very nice

  • @itsranjan2003
    @itsranjan2003 2 months ago

    Nice explanation..

  • @sribatsadas3742
    @sribatsadas3742 2 months ago

    Amazing roadmap to follow. Thank you 😊

  • @sam45326
    @sam45326 2 months ago

    Always 😂😂😂

  • @sam45326
    @sam45326 2 months ago

    Thanks for this interview playlist!

  • @lipsadas2236
    @lipsadas2236 2 months ago

    My name is Parle-G
