AWS Glue DataBrew Demo Video For Beginners
Published: 30 Jul 2024
AWS Glue DataBrew is a visual data preparation tool that makes it easy for data analysts and data scientists to clean and normalize data for analytics and machine learning up to 80% faster. In this demo, we give an overview of what AWS Glue DataBrew is, cover its key capabilities, and walk you through how you could use AWS Glue DataBrew for faster data preparation.
Learn more about AWS Glue DataBrew - amzn.to/3niRQ51
Subscribe:
More AWS videos bit.ly/2O3zS75
More AWS events videos bit.ly/316g9t4
#AWS #AWSGlue #AWSGlueDataBrew #DataPreparation #DataCleaning #DataNormalization #ETL #Analytics #MachineLearning Science
Power Query to the rescue! Lmao great work AWS ;)
Amazing tool.
How can we configure the character set for the input file? It always comes out garbled :(
You should add all the pandas functions. I've just tested DataBrew and I can't convert nanoseconds into the DateTime format. In pandas it's so simple; with DataBrew I can't... It's not saving me any time.
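For reference, the pandas conversion the comment above alludes to can be sketched as follows. This is a minimal illustration, assuming the values are integer Unix epoch timestamps in nanoseconds; the sample values and variable names are hypothetical and do not come from the commenter's data.

```python
import pandas as pd

# Hypothetical sample: Unix epoch timestamps expressed in nanoseconds.
raw = pd.Series([1_600_000_000_000_000_000, 1_600_000_001_500_000_000])

# unit="ns" tells pandas to interpret the integers as nanoseconds since the epoch.
converted = pd.to_datetime(raw, unit="ns")

print(converted)
```

If the source column holds seconds or milliseconds instead, the same call should work with `unit="s"` or `unit="ms"`.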
What would be great is to be able to export the steps as Python code.
Check out Dropbase. It lets you export data processing steps as Python code.
@jimmyechan I know there are tools that do it. I was just giving AWS some suggestions for improvement.
What about unstructured data?
What kinds of unstructured data are you thinking about?
@SurbhiDangi XML or JSON from scraping websites, etc. It appears every dataset has to be a table in Glue.
@edwinthatsnotmyname3670 Agree, the tool supports JSON (+ nested JSON) today! Also, there is a direct S3 connector. You can also upload a file from your local disk, if you'd like. Take a look at the 3rd screenshot here: aws.amazon.com/blogs/aws/announcing-aws-glue-databrew-a-visual-data-preparation-tool-that-helps-you-clean-and-normalize-data-faster/
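To make the "nested JSON as a table" point in this thread concrete, here is a rough sketch of how nested records flatten into columns. It uses pandas rather than DataBrew itself, so it only illustrates the idea and says nothing about how DataBrew handles it internally; the record fields are invented for the example.

```python
import pandas as pd

# Hypothetical scraped records with nested objects (field names are made up).
records = [
    {"id": 1, "user": {"name": "ana", "geo": {"city": "Lisbon"}}},
    {"id": 2, "user": {"name": "bo", "geo": {"city": "Oslo"}}},
]

# json_normalize expands nested dicts into flat columns joined by the separator.
flat = pd.json_normalize(records, sep="_")

print(flat.columns.tolist())
print(flat)
```

Each nested key path becomes its own column (for example `user_geo_city`), which is the tabular shape a table-oriented tool expects.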
Just in case this helps: 1. Create a Glue crawler to run on your unstructured (e.g. JSON) data (if the structure is complex, like highly nested documents, you can use Grok patterns or define it manually in Athena (for example: create table... column struct