AWS Tutorials - Joining Datasets in AWS Glue ETL Job
- Published: Feb 25, 2023
- Joining two or more datasets to create a curated dataset for a business purpose is a very common requirement when building an ETL job. Learn how to build a Glue ETL job with AWS Glue Studio that joins two datasets, transforms the joined dataset, and finally writes the result to the destination location.
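The join, transform, and write steps described above can be sketched in plain Python. This is only an illustration of the logic, not the job built in the video and not the AWS Glue API: the `orders` and `customers` rows, the column names, and the helper functions `inner_join`, `transform`, and `write_csv` are all hypothetical, and an in-memory buffer stands in for the destination location.

```python
import csv
import io


def inner_join(left, right, left_key, right_key):
    """Inner-join two lists of dict rows on the given key columns."""
    # Index the right side by key so each left row is matched by lookup.
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)

    joined = []
    for left_row in left:
        # Left rows without a match are dropped (inner-join semantics).
        for right_row in index.get(left_row[left_key], []):
            merged = dict(right_row)
            merged.update(left_row)  # left values win on a column-name clash
            joined.append(merged)
    return joined


def transform(rows):
    """Keep a curated subset of columns and derive an order total."""
    return [
        {
            "order_id": row["order_id"],
            "customer_name": row["name"],
            "total": row["quantity"] * row["unit_price"],
        }
        for row in rows
    ]


def write_csv(rows, fieldnames, destination):
    """Write rows to any file-like destination as CSV with a header."""
    writer = csv.DictWriter(destination, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical sample data standing in for the two source datasets.
orders = [
    {"order_id": 1, "customer_id": 10, "quantity": 2, "unit_price": 5.0},
    {"order_id": 2, "customer_id": 11, "quantity": 1, "unit_price": 7.5},
    {"order_id": 3, "customer_id": 99, "quantity": 4, "unit_price": 1.0},
]
customers = [
    {"id": 10, "name": "Ana"},
    {"id": 11, "name": "Ben"},
]

curated = transform(inner_join(orders, customers, "customer_id", "id"))

# An in-memory buffer plays the role of the destination location.
buffer = io.StringIO()
write_csv(curated, ["order_id", "customer_name", "total"], buffer)
print(buffer.getvalue())
```

Order 3 has no matching customer, so the inner join drops it and only two curated rows are written. In a real Glue job the same three stages would be carried out on the job's own data frames and written to a destination such as S3.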
Science
Yes, this is exactly what I have been looking for over the last few days. Thank you for making this!
Glad it was helpful!
Hi
Thanks for the video. Can you describe how this job runs behind the scenes, and is there any way to control the Parquet file size or block size?
I am having an issue in AWS Glue where it is not saving the ON conditions in the join. I have no idea how to fix this and could use any help.
Sir, could you please provide the sample datasets you are using in the S3 bucket? It would be more convenient for us to practice on them. Thank you for making these unique and knowledge-oriented videos. Again, thank you, sir.
I can try, but I generally source my data from kaggle.com. If you use that site, you will find loads of sample data there, including the datasets I use.