Cleansing the CSV data and processing in PySpark | Scenario based question | Spark Interview Questions
- Published: 4 Feb 2022
- Hi Friends,
Sample code is checked into GitHub:
github.com/sravanapisupati/Sa...
In this video, I have explained the procedure for reading a CSV file and processing it using PySpark.
The CSV has multiple lines for a single Id and uneven columns (a different number of columns in each row).
Please subscribe to my channel for more interesting learnings.
Science
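To make the scenario in the description concrete, here is a minimal plain-Python sketch of the cleansing logic. It is not the code from the video and does not use PySpark; it only illustrates one way to handle rows of uneven width and several lines per Id. The sample data, the column names (`id`, `v1`, `v2`, `v3`) and the helper names (`read_rows`, `pad_rows`, `collect_by_id`) are all assumptions made up for this example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample: the first field is the Id, rows have uneven widths,
# and the same Id appears on more than one line.
RAW = """id,v1,v2,v3
1,a,b
1,c
2,d,e,f
3
3,g
"""


def read_rows(text):
    """Return the header and all non-empty data rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = [row for row in reader if row]
    return header, rows


def pad_rows(rows, width):
    """Pad short rows with empty strings (and trim long ones) to a fixed width."""
    return [(row + [""] * width)[:width] for row in rows]


def collect_by_id(rows):
    """Gather every non-empty value that belongs to the same Id."""
    merged = defaultdict(list)
    for row in rows:
        merged[row[0]].extend(value for value in row[1:] if value)
    return dict(merged)


header, rows = read_rows(RAW)
padded = pad_rows(rows, len(header))
collected = collect_by_id(padded)
print(collected)
```

Padding first gives every row the same shape, which is the property a tabular engine needs; collecting by Id afterwards merges the lines that belong together.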
Excellent
Your tutorials are simply special Sravana!!
Thanks a lot, Sudip.
Superb, everyone can easily understand 👍 👏
Thanks a lot, Sravan 😊👍
Amazing video. Now I know where I went wrong. Thanks for the video.
Thanks a lot:)
Thanks for the information
Thanks a lot
excellent lesson!
Thanks a lot.
Please do more scenario-based videos on PySpark. In our current project we use PySpark for transformations in ADB, and ADF only for data movement.
Sure, Sravan.
Good Explanation.👍
Thank you:)
@sparklingFuture
Why can't we use pivot and filter the data on top of it? It would be a one-liner, right?
Your videos are awesome and use a more advanced approach, but please upgrade your audio system. It's a request. 🙏
Thanks a lot, Akash. Sure, thanks for the feedback. I'll take care of this.
Can you please cover this scenario: how to load a CSV file into JSON with a nested hierarchy using PySpark in ADB? For example, a CSV with custid, custname, itemname, quantity, which when converted to nested JSON becomes custid, custname, purchases { itemname : book, quantity : 2 }, where one customer buys multiple items.
Sure Sravan.
@@sravanalakshmipisupati6533 Thank you
@Sravan, please check the video ruclips.net/video/zhbMKX9eIPM/видео.html
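For the CSV-to-nested-JSON question in this thread, the target shape can be sketched in plain Python. This is not the approach from the linked video and it does not use PySpark; it only shows the grouping logic. The column names follow the comment, while the sample rows and the `to_nested` helper are hypothetical. In PySpark the same shape is commonly built with `groupBy` plus `collect_list` over a `struct` of the item columns.

```python
import csv
import io
import json

# Hypothetical sample rows; one customer can buy several items.
RAW = """custid,custname,itemname,quantity
1,Asha,book,2
1,Asha,pen,5
2,Ravi,bag,1
"""


def to_nested(text):
    """Group flat purchase rows into one record per customer."""
    customers = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = row["custid"]
        # Create the customer record on first sight, then append purchases.
        customer = customers.setdefault(
            key,
            {"custid": key, "custname": row["custname"], "purchases": []},
        )
        customer["purchases"].append(
            {"itemname": row["itemname"], "quantity": int(row["quantity"])}
        )
    return list(customers.values())


nested = to_nested(RAW)
print(json.dumps(nested, indent=2))
```

Each output record keeps the customer fields once and nests the item-level fields in a `purchases` list.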
Hello, can you please confirm where you specified the column names when you first read the data from the CSV? How did the column names appear in the show() output?
Hi Rajani, the input file includes a header row, and I set the header option to true when reading it, so show() displays those column names.
@@sravanalakshmipisupati6533 Thank you so much. Also, can you please post more videos on ingesting and transforming data from/to on-premises databases and other cloud storage?
@@rajanib9057 Thank you. For on-premises sources, most of the ingestion scenarios with different file formats are already covered. Please check my videos - github.com/sravanapisupati/SampleDataSet/blob/main/RUclips_videos_list
How can we merge Spark DataFrames with complex types? If we have two JSON files and the json1 schema differs from the json2 schema, how can we merge them using PySpark? Can you please explain this scenario?
Sure Sravan, I will get back to you soon for all your questions. Could you please share sample data?
@@sravanalakshmipisupati6533 ok
Hi Sravan, you can read the 2 JSON files separately as 2 DataFrames and then join them. If this is not what you are looking for, then please give me the detailed problem statement with some sample data. Thanks.
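The "read separately, then join" idea from the reply above can be illustrated with a small plain-Python sketch. It is not PySpark, and the two JSON documents, the join key `id` and the `inner_join` helper are assumptions invented for this example. In PySpark the corresponding steps would typically be two `spark.read.json` calls followed by `DataFrame.join`.

```python
import json

# Two hypothetical JSON inputs with different schemas that share an "id" key.
JSON_1 = '[{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ravi"}]'
JSON_2 = '[{"id": 1, "city": "Pune"}, {"id": 3, "city": "Delhi"}]'


def inner_join(left, right, key):
    """Inner-join two lists of records on `key`.

    Assumes `key` is unique on the right side; a duplicate would
    overwrite the earlier record in the lookup table.
    """
    right_by_key = {record[key]: record for record in right}
    return [
        {**record, **right_by_key[record[key]]}
        for record in left
        if record[key] in right_by_key
    ]


joined = inner_join(json.loads(JSON_1), json.loads(JSON_2), "id")
print(joined)
```

Only ids present in both inputs survive an inner join; a different join type would be needed to keep unmatched records from either side.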