AWS Tutorials - Access Glue Catalog using Amazon Redshift Spectrum
- Published: Feb 5, 2022
- Amazon Redshift Spectrum queries data in Amazon S3 buckets without loading it into Amazon Redshift tables. It helps optimize query cost and performance when data is split between Amazon Redshift and Amazon S3. Learn how to query data in Amazon S3 using Amazon Redshift Spectrum and the Glue Data Catalog.
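As a rough illustration of the setup the description refers to, the sketch below builds the SQL that maps a Glue Data Catalog database into Redshift as an external schema. It is not taken from the video: the schema name, Glue database name, IAM role ARN, and table name are all hypothetical placeholders, and the helper `external_schema_sql` is invented for this example. It only assembles the statement text; running it against a cluster is left to whichever SQL client you use.

```python
import re

# Hypothetical helper (not from the video): assembles a Redshift
# CREATE EXTERNAL SCHEMA statement that points at a Glue Data Catalog database.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def external_schema_sql(schema, catalog_db, iam_role_arn, region=None):
    """Return the SQL text mapping a Glue database to a Redshift external schema."""
    # Reject anything that is not a plain identifier, because the names are
    # interpolated directly into the statement text.
    for name in (schema, catalog_db):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a simple identifier: {name!r}")
    parts = [
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}",
        "FROM DATA CATALOG",
        f"DATABASE '{catalog_db}'",
    ]
    if region:
        parts.append(f"REGION '{region}'")
    parts.append(f"IAM_ROLE '{iam_role_arn}'")
    return "\n".join(parts) + ";"


if __name__ == "__main__":
    # All values below are placeholders chosen for illustration.
    print(external_schema_sql(
        "spectrum_demo",
        "demo_glue_db",
        "arn:aws:iam::123456789012:role/DemoSpectrumRole",
        region="us-east-1",
    ))
    # Once the schema exists, catalog tables are queried like local ones:
    print("SELECT COUNT(*) FROM spectrum_demo.sales;")
```

The IAM role is assumed to grant Redshift read access to both the Glue catalog and the underlying S3 location.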
Science
Great content as always!!
Great tutorial! It helped me a lot. Greetings from Brazil, my friend. It's because there are people like you in the community that I'm able to improve my skills and knowledge with more accuracy and fewer headaches. Thank you!
Extremely good tutorial. Loved the way you explained Redshift Spectrum
You made this topic simple to understand. Thanks!
Good explanation of the need for Redshift Spectrum and the demo.
Thanks.
Cool Tutorial, thanks!
Glad you like it!
Thanks for this comprehensive tutorial on accessing the Glue Data Catalog from Redshift Spectrum! This channel has another video on accessing S3 data in Redshift using Redshift Spectrum. Can you help me understand the difference between these two videos, since that one also uses Glue Data Catalog tables to access S3 data?
Amazing clarity on service integration
Thanks
Amazing video thank you!
Hello, thanks for the video, very informative. I would love to see a video about Redshift Serverless with a focus on billing and usage limits. Thanks again.
Sure, let me plan for it.
Thank you, Dojo Sir. It's a very nice and informative explanation. We hope to see more videos on Redshift from you.
Sure. Please recommend some topics about Redshift.
Excellent tutorial.
Glad you liked it!
Thank you, it was a nice video.
Really really nice and helpful video!
Glad to hear that!
Awesome video
Thank you AWS Tutorials for the video. Do you have any information on the query performance difference between querying via Spectrum, which uses Redshift compute, and querying via Athena?
I am not aware of any public numbers. In Redshift, we control the compute, so we can increase performance if we want. In Athena, we have no control over compute allocation. Let me see if I can do an apples-to-apples comparison and make a video about it. Thanks for the feedback.
Hi AWS Tutorials, I am still confused about when to use Spectrum. Suppose I have a star schema data warehouse on Redshift: should I put the fact tables or the dimension tables into Spectrum for storage? Or, since a data warehouse typically has different layers (e.g. landing, staging), do we just put the staging layer into Spectrum? I would appreciate it if you could answer my question. Thanks.
Awesome
My query: We see a lot of new things added to Lake Formation and Glue, and a lot of promotion by AWS, but I do not see any good tool to consume the final data created on this data lake.
Athena's use case is ad hoc analysis, and it has limitations on the complexity of the queries you can run and on the number of queries that can run in parallel. So how do end users consume the data created on an AWS data lake?
Redshift Spectrum is one way, but it needs a Redshift cluster. If we create only a data lake using S3, what is the end user consumption method/UI?
If I want to compare it with something like Databricks, what does AWS provide?
Please explore Amazon QuickSight. Let me know if it helps.
@AWSTutorialsOnline Doesn't it provide a dashboard and visualization experience? My question is how data analysts can run queries on the data to get useful information.
@ladakshay Glue is for cataloging and ETL only. For visualization, you can use QuickSight, which connects seamlessly to the Glue Catalog and provides self-service features.
If I have added multiple CSV files in a single folder, will the query still return output?
You are working at the catalog level, and a catalog table can cover multiple files. So, in summary, Redshift will return output for the multiple files as well.
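To make the reply above concrete: a catalog table points at an S3 folder (prefix) rather than a single file, so every data file under that prefix contributes rows. The sketch below only simulates that selection in plain Python and makes no AWS calls. The function `files_scanned` and the sample keys are hypothetical, and the skip rule (names starting with `.`, `_`, or `#`, or ending with `~`) reflects my understanding of the Redshift Spectrum documentation rather than anything shown in the video.

```python
# Hypothetical simulation (no AWS calls): which objects under a table's
# S3 prefix would be treated as data files.
def files_scanned(table_prefix, object_keys):
    """Return the keys under table_prefix that look like data files."""
    prefix = table_prefix.rstrip("/") + "/"
    scanned = []
    for key in object_keys:
        # Skip objects outside the table location and folder markers.
        if not key.startswith(prefix) or key.endswith("/"):
            continue
        name = key.rsplit("/", 1)[-1]
        # Assumed rule: hidden and marker files are ignored.
        if name.startswith((".", "_", "#")) or name.endswith("~"):
            continue
        scanned.append(key)
    return scanned


if __name__ == "__main__":
    keys = [
        "sales/jan.csv",
        "sales/feb.csv",
        "sales/_SUCCESS",
        "other/unrelated.csv",
    ]
    # Both CSV files under sales/ are selected; the marker file and the
    # object under a different prefix are not.
    print(files_scanned("sales/", keys))
```

All files in the folder are expected to share the table's schema, since the catalog holds one column definition for the whole location.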
S3 is not a query engine...?