Cloud Data Engineering Project: Migrating an On-Premises Data Pipeline to AWS

  • Published: 19 Aug 2024
  • In this video, you will see:
    - Introduction: Overview of the project and the motivation behind migrating from an on-premises setup to AWS.
    - On-Premises Pipeline Overview: Understanding the initial pipeline using HDFS, Spark, Kafka, PostgreSQL, and Power BI (details here: [ • Big Data engineering P... ])
    - AWS Pipeline Architecture:
    - Data Collection: Scraping data from Jumia using BeautifulSoup, transforming it with Python and pandas, and storing it in Amazon S3 (a minimal sketch follows this list).
    - Data Processing and Cataloging: Using AWS Glue Crawler to automatically catalog the data stored in S3.
    - Data Analysis: Running SQL queries on the data using Amazon Athena.
    - Data Visualization: Creating insightful visualizations with Amazon QuickSight.
    - Results Storage: Storing SQL query results in an S3 bucket.
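
    A minimal sketch of the data-collection step above, in Python: request a listing page, parse it with BeautifulSoup, clean the rows with pandas, and write a CSV to S3 with boto3. The category URL, CSS selectors, bucket name, and object key are placeholders, not the exact values used in the video.

    import io
    import boto3
    import pandas as pd
    import requests
    from bs4 import BeautifulSoup

    CATEGORY_URL = "https://www.jumia.ma/..."   # placeholder: category page to scrape
    RAW_BUCKET = "jumia-raw-data"               # placeholder: raw-data S3 bucket

    def scrape_products(url):
        # Fetch the listing page and parse it with BeautifulSoup.
        html = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=30).text
        soup = BeautifulSoup(html, "html.parser")
        rows = []
        for card in soup.select("article.prd"):       # selectors are assumptions
            name = card.select_one("h3.name")
            price = card.select_one("div.prc")
            if name and price:
                rows.append({"name": name.get_text(strip=True),
                             "price": price.get_text(strip=True)})
        return pd.DataFrame(rows)

    def upload_csv(df, bucket, key):
        # Serialize the DataFrame to CSV in memory and upload it to S3.
        buffer = io.StringIO()
        df.to_csv(buffer, index=False)
        boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=buffer.getvalue())

    if __name__ == "__main__":
        df = scrape_products(CATEGORY_URL)
        # Basic cleaning with pandas: strip currency symbols and cast to float.
        df["price"] = df["price"].str.replace(r"[^\d.]", "", regex=True).astype(float)
        upload_csv(df, RAW_BUCKET, "raw/jumia_products.csv")
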
    Detailed Steps Covered:
    1. Setting up S3 buckets for raw and processed data.
    2. Implementing web scraping with BeautifulSoup.
    3. Configuring the AWS Glue Crawler and Data Catalog (sketched after this list).
    4. Running data analysis queries with Amazon Athena (sketched below).
    5. Visualizing data with Amazon QuickSight.
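
    Step 3 can also be scripted with boto3. A rough sketch, assuming a hypothetical crawler name, IAM role, Data Catalog database, and S3 path:

    import boto3

    glue = boto3.client("glue")

    # Create a crawler that scans the raw-data prefix in S3 and registers a
    # table in the Glue Data Catalog (names and role ARN are placeholders).
    glue.create_crawler(
        Name="jumia-raw-crawler",
        Role="arn:aws:iam::123456789012:role/GlueCrawlerRole",
        DatabaseName="jumia_db",
        Targets={"S3Targets": [{"Path": "s3://jumia-raw-data/raw/"}]},
    )

    # Run it; once it finishes, the cataloged table is queryable from Athena.
    glue.start_crawler(Name="jumia-raw-crawler")
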
    #AWS #DataEngineering #CloudComputing #DataPipeline #BigData #WebScraping #AmazonS3 #AWSGlue #AmazonAthena #AmazonQuickSight #Python #BeautifulSoup
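
    A sketch of step 4: submitting a SQL query to Athena and letting it write the result set to an S3 bucket. The database, table, and output location are placeholders consistent with the crawler sketch above.

    import time
    import boto3

    athena = boto3.client("athena")

    # Run a query against the table the Glue crawler registered; Athena writes
    # the result set as CSV to the given S3 output location.
    execution = athena.start_query_execution(
        QueryString="SELECT name, price FROM jumia_products ORDER BY price DESC LIMIT 10",
        QueryExecutionContext={"Database": "jumia_db"},
        ResultConfiguration={"OutputLocation": "s3://jumia-query-results/athena/"},
    )
    query_id = execution["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(2)
    print("Query finished with state:", state)

    For step 5, the cataloged table (or a saved Athena query) is typically added as a dataset in QuickSight through the QuickSight console, where the visualizations are built interactively.
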

Comments • 1

  • @aymanemaghouti2065 · A month ago · +1

    To see my latest project, refer to: [ ruclips.net/video/MDplJJlo-y8/видео.htmlsi=yTQbgnnUXjEv7FEH ]