Building a Data Lake on AWS with AWS Glue, Glue Studio, Amazon Athena, and S3

  • Published: Sep 6, 2024
  • Build a simple Data Lake on AWS using a combination of services, including AWS Glue Data Catalog, AWS Glue Crawlers, AWS Glue Jobs, AWS Glue Studio, Amazon Athena, and Amazon S3.
    All open-source files on GitHub: github.com/gar....
    This video represents my own viewpoints and not those of my employer, Amazon Web Services (AWS). All product names, logos, and brands are the property of their respective owners.
    📣 Please subscribe to my YouTube channel for future videos.

Comments • 17

  • @donaldmahaya2689
    @donaldmahaya2689 2 years ago +2

    Excellent hands-on video covering building a data lake in AWS.

  • @hareepjoshi
    @hareepjoshi 1 year ago +1

    Hey Gary, found you through your Medium articles and now I'm watching your YouTube videos. Excellent content!

  • @DarioRomeroDeveloper
    @DarioRomeroDeveloper 1 year ago +1

    I always enjoyed the 'down to earth' practical business cases from @Gary Stafford. This one is really good. Thanks for sharing. I've learned a lot with this tutorial.

  • @mohammedgt8102
    @mohammedgt8102 2 years ago +2

    Gary, that was just perfect! Fast and straight to the point. 👏👏. Thank you!

  • @rixonmathew
    @rixonmathew 2 years ago +2

    Great video. Many complex concepts have been explained using simple language and examples.

  • @gatorpika
    @gatorpika 1 year ago +2

    Really great presentation, thanks for that.

  • @_truthful_q_
    @_truthful_q_ 2 years ago +1

    This was excellent 👏 Top marks my man!

  • @freakinmonkey85
    @freakinmonkey85 2 years ago +1

    Thanks a lot! This video deserves more views. It’s the first concise, to-the-point video I’ve found where actual data and actual results are shown end to end.
    I have a question I hope you can answer. How would you handle data that changes? E.g., in a couple of days a customer cancels a ticket with id 1234, concert id 321.
    Now the calculations need to take this into account, no?

    • @GaryStafford
      @GaryStafford  2 years ago

      ruclips.net/video/25StasmCVSw/видео.html

  • @swapnilbops1486
    @swapnilbops1486 1 year ago

    Very useful 🌟

  • @RashaadFontenot
    @RashaadFontenot 2 years ago +1

    Great video

  • @profbiyi
    @profbiyi 2 years ago +1

    Hi, thanks so very much for this. Is it possible to do an incremental load into S3 from RDS with Glue?

  • @ericpho4060
    @ericpho4060 2 years ago

    Hi, thanks for this video. Very interesting!
    I would be curious to know how you would handle incremental updates of these aggregated tables through Athena SQL queries. With that architecture, would you rerun the full calculation over the entire data set at each execution?

    • @GaryStafford
      @GaryStafford  2 years ago +1

      Here is the documentation on the current level of integration possible with Amazon Athena (docs.aws.amazon.com/athena/latest/ug/querying-hudi.html):
      "Currently, Athena supports snapshot queries and read optimized queries, but not incremental queries. On MoR tables, all data exposed to read optimized queries are compacted. This provides good performance but does not include the latest delta commits. Snapshot queries contain the freshest data but incur some computational overhead, which makes these queries less performant."

  • @sampyism
    @sampyism 1 year ago

    How much did all of this cost you per month?