AWS Tutorials - Introduction to AWS Glue Studio

  • Published: 19 Oct 2024

Comments • 42

  • @sachinamin6623
    @sachinamin6623 4 years ago +1

    One of the first AWS Glue Studio explanations that is simple and easy to follow - thank you for sharing.

  • @NikMartin-I-am
    @NikMartin-I-am 4 years ago +1

    Very nice tutorial, easy to follow and understand, thank you!

  • @prakashs2150
    @prakashs2150 3 years ago +1

    Good content, framed so nicely. Thanks!

  • @subhamaybhattacharyya
    @subhamaybhattacharyya 2 years ago

    Great tutorial!! Really very helpful for any AWS developer willing to learn Glue.
    Can you please create a video on AWS Data Pipeline with a comparison between the two services?

  • @ubaddala
    @ubaddala 4 years ago

    A very clean and concise video! Keep up the good work

  •  3 years ago +1

    Nice tutorial!! What about the Spark SQL Transform?

    • @AWSTutorialsOnline
      @AWSTutorialsOnline  3 years ago

      Apologies for the late response due to my summer break.
      I made one video about using SQL Transform in Glue Studio. Here is the link - ruclips.net/video/JoB6uarC0SE/видео.html
      Hope it helps,

  • @alokanand851
    @alokanand851 2 years ago +1

    Hi all,
    We are using AWS Glue + PySpark to perform ETL into a destination RDS PostgreSQL DB. The destination tables have primary- and foreign-key columns with the UUID data type, and we are failing to populate these UUID-type columns. How can we achieve this? Please suggest.

    • @AWSTutorialsOnline
      @AWSTutorialsOnline  2 years ago

      I am not sure what error you are getting. The ETL job has to respect table-level column constraints; as long as you are doing that, there should not be a problem.
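
      A frequent cause of this particular failure is that Spark sends UUID values as strings and PostgreSQL refuses the implicit varchar-to-uuid cast. Below is a hedged sketch of one common fix, appending stringtype=unspecified (a PostgreSQL JDBC driver option) to the connection URL; the host, database, table, and credentials are placeholders, not values from this thread:

```python
# Build Glue JDBC connection options for a PostgreSQL sink. The key detail is
# stringtype=unspecified, which makes the JDBC driver send strings untyped so
# PostgreSQL can cast them into uuid columns. All names here are placeholders.
def jdbc_options(host, database, table, user, password):
    return {
        'url': f'jdbc:postgresql://{host}:5432/{database}?stringtype=unspecified',
        'dbtable': table,
        'user': user,
        'password': password,
    }

opts = jdbc_options('my-rds-host', 'mydb', 'public.orders', 'etl_user', 'secret')

# Inside the Glue job this would be used roughly as (not runnable outside Glue):
# glueContext.write_dynamic_frame.from_options(
#     frame=dyf, connection_type='postgresql', connection_options=opts)
```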

  • @grizzlylovegrizzlylove2025
    @grizzlylovegrizzlylove2025 4 years ago +1

    Very clear. Good job. Thanks a lot.
    Maybe you can add a few steps to explain how the output data can be consumed with Athena and/or QuickSight.

    • @AWSTutorialsOnline
      @AWSTutorialsOnline  4 years ago

      Hi,
      Thanks for the feedback. I do have another video which talks about consuming data with Athena. Please have a look - ruclips.net/video/l5Hz2qkp4K0/видео.html
      Please let me know if you want to cover anything else from the Athena point of view. Meanwhile, I have made a note about QuickSight and will come back with a demo for using QuickSight with a Data Lake.

  • @rkhadke
    @rkhadke 3 years ago +1

    Hi, I want to create a Glue Studio connection to Snowflake using a scripting approach. It can be created via the UI; however, I want to create it using Terraform, CloudFormation, etc. Please help.

    • @AWSTutorialsOnline
      @AWSTutorialsOnline  3 years ago

      To be honest, I need to check the feasibility of it, especially because of Snowflake.

  • @vsr1727
    @vsr1727 3 years ago +1

    Thank you

  • @LDH0507
    @LDH0507 4 years ago +1

    Could you introduce the AWS Glue Spark UI with a Job and Dev Endpoint (in SageMaker) for monitoring Spark processes? I want to know how to set up a Spark history server in AWS!

  • @krishnasanagavarapu4858
    @krishnasanagavarapu4858 2 years ago

    awesome bro

    • @AWSTutorialsOnline
      @AWSTutorialsOnline  2 years ago

      For Snowflake, Glue provides a custom connector. Please check this link - aws.amazon.com/blogs/big-data/performing-data-transformations-using-snowflake-and-aws-glue/
      For data-stream / clickstream data, you can use Amazon Kinesis or Amazon MSK for data ingestion.

    • @krishnasanagavarapu4858
      @krishnasanagavarapu4858 2 years ago

      @@AWSTutorialsOnline thank you. I will check on this.

  • @amitannd
    @amitannd 3 years ago +1

    Hi, I have two problems. 1) When I run the crawler on an S3 bucket where I've put the data (a CSV file with pipe '|' delimiters), it doesn't put the column names in the output schema, nor does it ask whether the first row is a header. So instead of the actual column names it creates col0, col1, and so on. How do I tackle this problem? 2) If a folder contains multiple CSV files with different kinds of data, the crawler creates only one table which appends all the CSV files' data into one. How do I control this?

    • @amitannd
      @amitannd 3 years ago

      Can I get a resolution to this?

    • @AWSTutorialsOnline
      @AWSTutorialsOnline  3 years ago

      Hello Amit,
      Thanks for the questions and apologies for the delayed response.
      1) For the first part, you need to use the crawler with a custom classifier. Please check this video of mine which will help - ruclips.net/video/-3Itap4FPHI/видео.html
      2) For the second part, you might want to combine similar files into separate folders and make sure you check "Create a single schema for each S3 path." in the crawler. Please read the "How to Create a Single Schema for Each Amazon S3 Include Path" section in the link here - docs.aws.amazon.com/glue/latest/dg/crawler-configuration.html?icmpid=docs_glue_console
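
      For readers who prefer scripting this, the custom classifier for point 1 can also be created via the API. A minimal sketch, assuming the boto3 Glue client; the classifier and crawler names are hypothetical, and the AWS calls are left commented because they need credentials:

```python
# Request body for a custom CSV classifier: pipe-delimited, first row is the
# header. 'ContainsHeader': 'PRESENT' is what stops the crawler from generating
# col0, col1, ... placeholder column names.
classifier_request = {
    'CsvClassifier': {
        'Name': 'pipe-delimited-with-header',  # hypothetical name
        'Delimiter': '|',
        'ContainsHeader': 'PRESENT',
    }
}

# import boto3
# glue = boto3.client('glue')
# glue.create_classifier(**classifier_request)
# glue.update_crawler(Name='my-crawler', Classifiers=['pipe-delimited-with-header'])
```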

    • @amitannd
      @amitannd 3 years ago

      @@AWSTutorialsOnline thanks, sir. Let me try again.

    • @AWSTutorialsOnline
      @AWSTutorialsOnline  3 years ago

      Hi Amit, did it help?

  • @Videos-rj1ek
    @Videos-rj1ek 2 years ago

    Can you please make a video on moving Glue code to prod using CI/CD?

  • @sachinamin6623
    @sachinamin6623 4 years ago +1

    If you can, please share similar videos on Redshift and MySQL using Glue Studio.

  • @srinathpugalenthi7291
    @srinathpugalenthi7291 4 years ago +1

    Can we migrate Informatica XML files to AWS Glue Studio?

    • @AWSTutorialsOnline
      @AWSTutorialsOnline  4 years ago

      Hi, no - migration of Informatica XML files to Glue Studio is not available. You will have to re-engineer.

  • @DineshKumar-cu3bg
    @DineshKumar-cu3bg A year ago +1

    Is there a way we can write the code and have it create a workflow in the editor?

    • @AWSTutorialsOnline
      @AWSTutorialsOnline  A year ago

      Can you please elaborate on your question?

    • @DineshKumar-cu3bg
      @DineshKumar-cu3bg A year ago

      @@AWSTutorialsOnline Hi, currently a user can create a workflow using the built-in transformations & available connectors in the Glue editor, and the code is generated automatically. My question is: using AWS CDK / CloudFormation, can we write such code, deploy it so that a Glue workflow is created, and then open that workflow in the editor in the AWS console?

    • @AWSTutorialsOnline
      @AWSTutorialsOnline  A year ago

      @@DineshKumar-cu3bg You can use CDK or CloudFormation for creating Glue Resources. Please check this - docs.aws.amazon.com/glue/latest/dg/populate-with-cloudformation-templates.html
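
      As a rough illustration of the CloudFormation route, the resource for a Glue job can be sketched like this; building the template as a Python dict keeps it printable and checkable. The job name, role ARN, and script location are placeholders, not values from this thread:

```python
import json

# Minimal CloudFormation template declaring an AWS::Glue::Job. Deploying it
# creates the job; the visual (Glue Studio) DAG metadata is stored separately,
# so a job created this way may open in the script editor rather than the
# visual editor. All identifiers below are placeholders.
template = {
    'AWSTemplateFormatVersion': '2010-09-09',
    'Resources': {
        'MyGlueJob': {
            'Type': 'AWS::Glue::Job',
            'Properties': {
                'Name': 'my-etl-job',
                'Role': 'arn:aws:iam::123456789012:role/GlueJobRole',
                'GlueVersion': '3.0',
                'Command': {
                    'Name': 'glueetl',  # Spark ETL job type
                    'ScriptLocation': 's3://my-bucket/scripts/etl.py',
                },
            },
        }
    },
}

print(json.dumps(template, indent=2))
```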

  • @sachinamin6623
    @sachinamin6623 4 years ago +1

    Is it possible to rename the target file name in S3? Right now it defaults to run-DataSinkXXXXXXXXX.

    • @AWSTutorialsOnline
      @AWSTutorialsOnline  4 years ago +1

      It seems you cannot control the file name in the target nodes. There are two ways you can work around it:
      1) When the file is written to the S3 bucket --> raise an event and call a Lambda function --> the Lambda function renames the file.
      2) In the Glue job, use a Custom Transform node with code like the following (note that create_dynamic_frame reads the data; the matching write call is glueContext.write_dynamic_frame.from_options, where connection_options sets the output path):
      glueContext.create_dynamic_frame.from_options(
          connection_type='s3',
          connection_options={'paths': ['s3://awsglue-datasets/examples/medicare/Medicare_Hospital_Provider.csv']},
          format='csv',
          format_options={'withHeader': True})
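
      Option 1 (the Lambda rename) can be sketched roughly as follows. S3 has no true rename, so the object is copied to the new key and the original is deleted; the event wiring, key pattern, and final file name below are assumptions for illustration:

```python
def target_key_for(old_key, final_name):
    # Pure helper: keep the object's prefix, swap in the desired file name.
    prefix, _, _ = old_key.rpartition('/')
    return f'{prefix}/{final_name}' if prefix else final_name

def rename_s3_object(bucket, old_key, new_key):
    # S3 has no rename operation: copy to the new key, then delete the original.
    import boto3
    s3 = boto3.client('s3')
    s3.copy_object(Bucket=bucket,
                   CopySource={'Bucket': bucket, 'Key': old_key},
                   Key=new_key)
    s3.delete_object(Bucket=bucket, Key=old_key)

def handler(event, context):
    # Hypothetical Lambda handler subscribed to the bucket's ObjectCreated event.
    record = event['Records'][0]['s3']
    bucket = record['bucket']['name']
    old_key = record['object']['key']
    if 'run-DataSink' in old_key:  # only touch Glue's auto-named outputs
        rename_s3_object(bucket, old_key, target_key_for(old_key, 'output.csv'))
```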