AWS Tutorials - Access Glue Catalog using Amazon Redshift Spectrum

  • Published: 5 Feb 2022
  • Amazon Redshift Spectrum queries data in Amazon S3 buckets without loading it into Amazon Redshift tables. This helps optimize query cost and performance when data is split between Amazon Redshift and Amazon S3. Learn how to query data in Amazon S3 using Amazon Redshift Spectrum and the Glue Data Catalog.
  • Science
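
A minimal sketch of the setup the description refers to, assuming a Glue database is already cataloged and an IAM role with access to Glue and S3 is attached to the cluster. The schema, database, table, and role names below are hypothetical, and the helpers only build SQL text; they do not connect to Redshift.

```python
def external_schema_sql(schema: str, glue_database: str, iam_role_arn: str) -> str:
    """Build a CREATE EXTERNAL SCHEMA statement that exposes a Glue
    Data Catalog database inside Redshift as an external schema."""
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE '{glue_database}'\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )


def spectrum_join_sql(schema: str, external_table: str,
                      local_table: str, join_key: str) -> str:
    """Build a query joining a local Redshift table with an external
    (S3-backed) table reached through the external schema."""
    return (
        f"SELECT l.*, e.*\n"
        f"FROM {local_table} AS l\n"
        f"JOIN {schema}.{external_table} AS e\n"
        f"  ON l.{join_key} = e.{join_key};"
    )


if __name__ == "__main__":
    # All names here are placeholders for illustration only.
    print(external_schema_sql(
        "spectrum_demo",
        "demo_glue_db",
        "arn:aws:iam::123456789012:role/DemoSpectrumRole",
    ))
    print(spectrum_join_sql("spectrum_demo", "orders", "customers", "customer_id"))
```

The generated statements can be pasted into the Redshift query editor; the data stays in S3 and only the external schema definition lives in Redshift.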

Comments • 33

  • @FaniHabtes
    @FaniHabtes 3 months ago

    Great content as always!!

  • @ricardoevers5496
    @ricardoevers5496 2 years ago +2

    Great tutorial! It helped me a lot. Greetings from Brazil my friend, it's just because there are people like you in the community that I'm able to improve my skills and knowledge with more accuracy and less brain aches. Thank you!

  • @mohankewlani1357
    @mohankewlani1357 1 year ago +1

    Extremely good tutorial. Loved the way you explained Redshift Spectrum

  • @michaelrobinson9112
    @michaelrobinson9112 10 months ago

    You made this topic simple to understand. Thanks!

  • @ladakshay
    @ladakshay 2 years ago +1

    Good explanation of the need for Redshift Spectrum and the demo.

  • @TimoKraus-sg2gk
    @TimoKraus-sg2gk 1 year ago

    Cool Tutorial, thanks!

  • @swagatamchakraborty3352
    @swagatamchakraborty3352 2 years ago

    Thanks for this comprehensive tutorial on accessing the Glue Data Catalog from Redshift Spectrum! This channel has another video on accessing S3 data in Redshift using Redshift Spectrum. Can you help me understand the difference between the two videos, since that one also uses Glue Data Catalog tables to access S3 data?

  • @ankur1jain2soft
    @ankur1jain2soft 2 years ago

    Amazing clarity on service integration

  • @rishadm1771
    @rishadm1771 1 year ago

    Amazing video thank you!

  • @aksharjamgaonkar3672
    @aksharjamgaonkar3672 1 year ago +1

    Hello, thanks for the video, very informative. I would love to see a video about Redshift Serverless with a focus on billing and usage limits. Thanks again.

  • @pradnyavantkalamkar6936
    @pradnyavantkalamkar6936 2 years ago

    Thank you, Dojo Sir. It's a very nice and informative explanation. We hope to see more videos on Redshift from you.

  • @sabyabera
    @sabyabera 2 years ago +1

    Excellent tutorial.

  • @MathSquad_24
    @MathSquad_24 2 years ago

    Thank you, it was a nice video.

  • @priyanshubansal5517
    @priyanshubansal5517 2 years ago +1

    Really really nice and helpful video!

  • @prannoyroy5312
    @prannoyroy5312 1 year ago

    Awesome video

  • @hsz7338
    @hsz7338 2 years ago

    Thank you, AWS Tutorials, for the video. Do you have any information on the query performance difference between querying via Spectrum, using Redshift compute, and querying via Athena?

    • @AWSTutorialsOnline
      @AWSTutorialsOnline 2 years ago +1

      I am not aware of any public numbers. In Redshift, we control the compute, so we can increase performance if we want. In Athena, we have no control over how compute is allocated. Let me think about whether I can do an apples-to-apples comparison and make a video about it. Thanks for the feedback.

  • @ryany420
    @ryany420 1 year ago

    Hi AWS Tutorials, I am still confused about when to use Spectrum. Suppose I have a star schema data warehouse on Redshift: should I put the fact tables or the dimension tables into Spectrum for storage? Or, since a data warehouse typically has different layers (e.g. landing, staging), do we just put the staging layer of data into Spectrum? I would appreciate an answer to my question, thanks.

  • @harshvardhan1156
    @harshvardhan1156 1 year ago

    Awesome

  • @ladakshay
    @ladakshay 2 years ago +1

    My query: We see a lot of new things added to Lake Formation and Glue, and a lot of promotion by AWS, but I do not see any good tool to consume the final data created on this Data Lake.
    Athena's use case is ad hoc analysis, and it has its limitations on the complex queries you can run and the number of parallel queries which can be run. So how do end users consume the data which is created on an AWS Data Lake?
    Redshift Spectrum is one way, but it needs a Redshift cluster. If we create only a Data Lake using S3, what is the end-user consumption method/UI?
    If I want to compare it with something like Databricks, what does AWS provide?

    • @AWSTutorialsOnline
      @AWSTutorialsOnline 2 years ago +1

      Please explore Amazon QuickSight. Let me know if it helps.

    • @ladakshay
      @ladakshay 2 years ago

      @AWSTutorialsOnline Does it not provide a dashboard and visualization experience? My query is how Data Analysts can run queries on the data to get useful information.

    • @AWSTutorialsOnline
      @AWSTutorialsOnline 2 years ago

      @ladakshay Glue is for cataloging and ETL only. For visualization, you can use QuickSight, which connects seamlessly to the Glue Catalog and provides self-service features.

  • @shankarmendyala9167
    @shankarmendyala9167 1 year ago

    If I have added multiple CSV files in a single folder, will the query still return output?

    • @AWSTutorialsOnline
      @AWSTutorialsOnline 1 year ago

      You are working at the catalog level, and a catalog table can cover multiple files. So, in summary, Redshift will return output for multiple files as well.
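
      The reply above can be illustrated with a simplified local model, not an AWS API: a catalog table points at an S3 folder (prefix), and every data file under that prefix is read. The sketch also assumes the documented Redshift Spectrum behaviour of skipping files whose names begin with a period, underscore, or hash mark, or end with a tilde. The function name and the sample keys are hypothetical.

```python
from typing import Iterable, List

# File-name prefixes treated as hidden in this simplified model.
HIDDEN_PREFIXES = (".", "_", "#")


def files_scanned(keys: Iterable[str], location_prefix: str) -> List[str]:
    """Return the object keys a table located at `location_prefix`
    would cover, in input order."""
    scanned = []
    for key in keys:
        if not key.startswith(location_prefix):
            continue  # outside the table's folder
        name = key.rsplit("/", 1)[-1]
        if not name:
            continue  # zero-length "folder" placeholder key
        if name.startswith(HIDDEN_PREFIXES) or name.endswith("~"):
            continue  # hidden or temporary file
        scanned.append(key)
    return scanned


if __name__ == "__main__":
    sample_keys = [
        "sales/part-0001.csv",
        "sales/part-0002.csv",
        "sales/_SUCCESS",
        "sales/.hidden.csv",
        "other/data.csv",
    ]
    # Both CSV files under sales/ are covered by the same table.
    print(files_scanned(sample_keys, "sales/"))
```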

  • @RichardPeterShon
    @RichardPeterShon 9 months ago

    S3 is not a query engine...?