![Shilpa DataInsights](/img/default-banner.jpg)
205 videos
306,797 views
Shilpa DataInsights
India
Joined 19 Mar 2023
Join us for tutorials and the latest trends. Whether you're looking to grasp the fundamentals or seeking advanced techniques, this channel offers comprehensive tutorials and guides.
Subscribe now for an exciting data journey!
The mission is to build a vibrant community of data enthusiasts, where knowledge is freely shared, and collaboration thrives.
Happy Learning!
Most Asked PySpark Interview Questions: Mastering explode() Function
Looking to ace your PySpark interview at top consulting firms or data-driven companies? One of the most commonly asked concepts is the explode() function, a powerful tool for handling nested and array data structures.
In this video, we cover:
✅ What is the explode() function in PySpark?
✅ How explode() helps transform complex data types into a structured format.
✅ Real-world examples and scenarios where explode() is essential.
✅ How to answer interview questions related to explode() with confidence.
If you're preparing for a Big 4 consulting interview (Deloitte, PwC, EY, KPMG) or aiming for a data engineering or analytics role, this video will help you strengthen your PySpark skills.
Subscribe to...
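For readers skimming the description, here is a minimal, illustrative explode() sketch; the dataset and column names are invented for demonstration and are not the ones used in the video:

from pyspark.sql import SparkSession
from pyspark.sql.functions import explode

spark = SparkSession.builder.appName("ExplodeDemo").getOrCreate()

# Invented nested data: one row per employee, skills stored as an array column
data = [("John", ["ADF", "SQL"]), ("Steve", ["PySpark"])]
df = spark.createDataFrame(data, ["EmpName", "Skills"])

# explode() produces one output row per element of the array column
df.select("EmpName", explode("Skills").alias("Skill")).show()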
49 views
Videos
Mastering Sort Functions in PySpark: Optimize Your Data Processing
54 views · 16 hours ago
In this video, I dive into the key sort functions in PySpark, demonstrating how to effectively order your data to enhance your Spark workflows. Understanding these sorting techniques is essential for improving performance and making data retrieval more efficient. In this video, we cover: ✅ What are Sort Functions in PySpark? ✅ Syntax and practical use cases ✅ Key differences between ascending a...
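For quick reference, a minimal sketch of the kind of sort calls covered; the data below is illustrative, not the video's dataset:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("SortDemo").getOrCreate()

# Illustrative data only
df = spark.createDataFrame(
    [("Kolkata", 45), ("Gurgaon", 12), ("Bangalore", None)],
    ["city", "score"],
)

df.orderBy(F.col("score").desc()).show()            # single-key descending sort
df.sort(F.asc("city"), F.desc("score")).show()      # multiple sort keys
df.orderBy(F.col("score").asc_nulls_last()).show()  # choose where nulls land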
Big 4 Consulting Interview Questions (Part 4): Master Join, GroupBy, DateFormat & Aggregation
129 views · 1 day ago
Welcome to Part 4 of our Big 4 Consulting Interview Prep series! 🚀 In this episode, we dive deep into some of the most commonly asked PySpark functions in Big 4 consulting firm interviews, including join, groupBy, date formatting, and aggregation functions. What you’ll learn in this video: How to use join to merge datasets effectively and handle real-world data relationships. Mastering groupBy ...
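As a rough companion sketch (the tables and columns below are hypothetical, not the interview datasets), the four topics can be chained like this:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("Part4Demo").getOrCreate()

# Hypothetical orders and customers tables
orders = spark.createDataFrame(
    [(1, 101, "2024-01-05", 250.0), (2, 102, "2024-01-06", 120.0), (3, 101, "2024-02-01", 90.0)],
    ["order_id", "cust_id", "order_date", "amount"],
)
customers = spark.createDataFrame([(101, "John"), (102, "Steve")], ["cust_id", "name"])

result = (
    orders.join(customers, on="cust_id", how="inner")                               # join on the key column
    .withColumn("order_month", F.date_format(F.to_date("order_date"), "yyyy-MM"))   # date formatting
    .groupBy("name", "order_month")                                                 # group per customer and month
    .agg(F.sum("amount").alias("total_amount"),                                     # aggregations
         F.count("*").alias("order_count"))
)
result.show()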
Big 4 Deloitte Data Engineering Interview Questions Part 3: Pyspark Functions Explained
368 views · 14 days ago
Big 4 Data Engineering Interview Questions Part 2: PySpark Functions Explained
124 views · 21 days ago
Big 4 Data Engineering Interview Questions Part 1: PySpark Functions Explained
109 views · 28 days ago
Master COLLECT_SET and COLLECT_LIST in Spark | Shilpa Data Insights
180 views · 1 month ago
Data Engineering Roadmap 2025 | Easiest Roadmap | 10X Your Salary | Become Top 1% Data Engineer
1.1K views · 1 month ago
Master LAG & LEAD Functions in SQL Window Functions | Step-by-Step Guide using SQL and Pyspark
138 views · 1 month ago
Null handling in pySpark DataFrame | isNull function in pyspark | isNotNull function in pyspark
160 views · 1 month ago
Master RANK, DENSE_RANK & ROW_NUMBER | Window Functions in SQL/Pyspark Explained! | Databricks
210 views · 1 month ago
Master Aggregated Functions in SQL: MIN, MAX, AVG, and COUNT
89 views · 1 month ago
Coding Interview question| Employees Earning Above Average Salary in Each Department| Pyspark & SQL
158 views · 2 months ago
Databricks | PySpark| SQL Coding Interview: Employees Earning More than Managers
193 views · 2 months ago
Complete Roadmap to become Azure Data Engineer: Best Certification Roadmap 2025
943 views · 2 months ago
Databricks| Pyspark| Coding Question: Pyspark and Spark SQL| Scenario Based Interview Question
275 views · 2 months ago
The Ultimate Python Roadmap 2025 (Before You Start). Fastest way to learn python programming
629 views · 2 months ago
Master PySpark Date & Time Functions | current_date, date_add, datediff, and More
207 views · 2 months ago
How to handle duplicate data in PySpark: A Step-by-Step Guide to Clean Data Efficiently!
266 views · 3 months ago
Databricks | Pyspark| Spark SQL: Except Columns in Select Clause
154 views · 3 months ago
split function in pyspark | pyspark advanced tutorial | getitem in pyspark | databricks tutorial
139 views · 3 months ago
Databricks | Pyspark | UDF to Check if Folder Exists
137 views · 4 months ago
Databricks Workshop Autoloader and copy into: Step by Step guide with demo| Pyspark
250 views · 4 months ago
Databricks Workshop DLT Pipelines: Step by Step guide creating DLT pipeline with demo| Pyspark
362 views · 4 months ago
Databricks Tutorial: Unity catalog, Secret Scope, RBAC| Hands-On Training |Workshop |Demo
454 views · 4 months ago
Databricks Tutorial: Lakehouse Architecture, Delta Tables, Time Travel, Optimize | Hands-On Training
475 views · 5 months ago
Databricks working with DBFS, Magic command and dbutils |Databricks Workshop Part 2
495 views · 5 months ago
Databricks for Beginner Setup, Architecture, Notebooks, Library| PySpark Tutorial
802 views · 5 months ago
Databricks DLT Pipelines: A Guide to step by step creating DLT pipeline with demo| Pyspark
647 views · 6 months ago
Databricks ingestion using Copyinto |Databricks | Pyspark| Incremental Data Load |Copy Into
521 views · 6 months ago
Hi Shilpa, nice explanation! But how would you differentiate between a DAG and a lineage graph? Many platforms say the DAG is the logical plan, while a few say the DAG is the physical plan.
Hi Ma'am, could you also add the script to create the dataframe? Thank you
Please refer to the script for creating the dataframe:
sample_data = [("Kolkata", "", "WB"), ("", "Gurgaon", None), (None, "", "banaglore")]
columns = ["city1", "city2", "city3"]
df_city = spark.createDataFrame(sample_data, schema=columns)
df_city.display()
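A possible follow-up once df_city exists, assuming the exercise is the isNull()/isNotNull() handling covered in the video (display() is Databricks-specific, as in the snippet above; use show() elsewhere):

from pyspark.sql import functions as F

# Rows where city2 is NULL (note: empty strings "" are not NULL)
df_city.filter(F.col("city2").isNull()).display()

# Rows where city3 is populated
df_city.filter(F.col("city3").isNotNull()).display()

# Optionally treat empty strings as NULL as well
df_city.withColumn("city1", F.when(F.col("city1") == "", None).otherwise(F.col("city1"))).display()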
Hi Ma'am, can you also paste the data in the description so that we can practice as well?
Hi, please use the below script for dataframe creation:
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType

data = [("01-06-2020", "Booked"), ("02-06-2020", "Booked"), ("03-06-2020", "Booked"),
        ("04-06-2020", "Available"), ("05-06-2020", "Available"), ("06-06-2020", "Available"),
        ("07-06-2020", "Booked")]
schema = StructType([
    StructField("show_date", StringType(), True),
    StructField("show_status", StringType(), True)
])
spark = SparkSession.builder.appName("Solution").getOrCreate()
df = spark.createDataFrame(data, schema)
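The actual exercise is in the video; as a small, assumed follow-up, one step usually needed with this data is parsing show_date (stored as a dd-MM-yyyy string) into a real date before any date arithmetic:

from pyspark.sql import functions as F

# Convert the dd-MM-yyyy strings into a DateType column
df_parsed = df.withColumn("show_date", F.to_date("show_date", "dd-MM-yyyy"))

# Example check: number of days per status
df_parsed.groupBy("show_status").count().show()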
Keep inspiring us 🎉
Thank you !!!!
C++ is faster in execution compared to Python. Python can be written quickly, but code execution is slower than in C++ 😊
This is not funny at all.
Very creative 🤩
Thank you! Cheers!
😂😂
give df in comments
Here is the code to create the dataframe:
data = [(1, 'John', 'ADF'), (1, 'John', 'ADB'), (1, 'John', 'PowerBI'),
        (2, 'Steve', 'ADF'), (2, 'Steve', 'SQL'), (2, 'Steve', 'Crystal Report'),
        (3, 'James', 'ADF'), (3, 'James', 'SQL'), (3, 'James', 'SSIS'),
        (4, 'Acey', 'SQL'), (4, 'Acey', 'SSIS'), (4, 'Acey', 'SSIS'), (4, 'Acey', 'ADF')]
schema = ["EmpId", "EmpName", "Skill"]
df1 = spark.createDataFrame(data, schema)
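This DataFrame looks like the one used for the collect_set/collect_list demo; treating that as an assumption, a minimal sketch of the two calls on df1:

from pyspark.sql import functions as F

# collect_list keeps duplicates (Acey has SSIS twice); collect_set de-duplicates
df1.groupBy("EmpId", "EmpName").agg(
    F.collect_list("Skill").alias("skills_list"),
    F.collect_set("Skill").alias("skills_set"),
).show(truncate=False)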
Thanks for these videos Shilpa. I find them very helpful. I just started a new job as a data engineer after working as a java web developer and this is really useful for me.
That's great, congratulations! Thank you, I am glad you found it helpful. If there is any topic you want me to cover, please let me know. 😊
Can Auto Loader support Delta tables? If any insert, update, or delete happens on a Delta table, can it trigger a load to some log table? If not, what is the other way to log these operations, just like a SQL trigger? Enlighten me please 😢😢
Great..keep inspiring 🎉
Thank you !!
Sweet and effective.
Ma'am, I wanted to know whether data engineering is still relevant to learn in 2025, as most folks are saying there are no data engineer jobs for freshers in India as well as in European countries. Could you kindly shed some light on this issue?
Data engineering is relevant, and it will stay relevant as long as we are dealing with data.
I am here to thank you, as I referred to your Databricks certification playlist. I watched the questions series and got to know a few concepts. I scored 100% in the exam, so thank you, and happy new year ❤
Happy to hear that the Databricks certification playlist helped you, and hearty congratulations on clearing the exam with such an amazing score 😀!! Keep learning and growing!
Thank you for this roadmap! This is very helpful.
Glad it was helpful!
Ma'am, I am a regular follower of your videos. Please add the data in the video for our practice. I am thankful for your useful content, which made me understand very well, even after subscribing to several courses on Udemy.
Glad it helped you😀. I have added the dataset with complete code. Please find my GitHub link: github.com/shilpadata/pyspark_databricks/tree/b5dfa02ed45af980153fb258ba8d308f0654a128
Movie name
I am thankful to you, ma'am, as you are a fantastic mentor in PySpark; no other YouTuber or Udemy tutor has this level of understanding, kudos to you. But please upload the dataset that you are working on. I have tried to put together a similar kind of dataset for previous videos, but by the time I try to understand, I get sidetracked because the data in my case behaves differently.
The dataset with the code is present at the GitHub link: github.com/shilpadata/pyspark_databricks/blob/main/Window_function_Lag_Lead.py You can fork it from the above link.
😂😂😂😂😂😂😂😂😂😂😂
Great video! Thanks
😂😂
Where have you been all these tough days? You are a gem of a person in PySpark, someone I searched a lot for on YouTube while joining 3 Udemy courses; literally, I feel you are an angel who came to help us with PySpark.
The dataset with the code is present at the GitHub link: github.com/shilpadata/pyspark_databricks/blob/b5dfa02ed45af980153fb258ba8d308f0654a128/Date_time_Functions.py You can fork or download it from the above link.
Relatability at its peak 📈
The tester got scared 😂
True 😂
The way you explained it is very easily understandable and interesting. Thank you so much. 🎉 And please continue with more step-by-step video playlists on PySpark, ADF, Databricks, SQL, and more.
Thank you. Sure I will be coming up with more videos!!
Nice explanation..
Thank you !!
Movie name
It's the movie Khatta Meetha; you will find it on YouTube.
Thank you for the very well explained Video!
Thank you !!
Very well explained 🙌🏻
Thank you 🙂
So dangerous - JavaScript 😅❤😊
so true
Very well explained👌🏻👌🏻
Thanks a lot 😊
😂😂
😂😂
Nice save
Indeed
🤣🤣🤣🤣🤣🤣🤣
:D
Really loving your creativity… Amazing work.. keep it up..
Glad you liked it, keep an eye out for more creative content!
How long does the token last, and when do I have to renew it?
Hi ma'am, can you suggest a Python playlist for data engineers?
Please check out the roadmap: ruclips.net/video/9oVsSEUDePE/видео.html. I will be creating a Python playlist soon.
Thank you for this interview qna. Very helpful!
My pleasure!
😂😂😂
very nice
I'm glad you liked it!
Nice explanation..
Thank you
Amazing roadmap to follow Thank you 😊
Always 😂😂😂
Thanks for this interview playlist..
Thank you
My name is Parle-G
😊😂