Sir, you are great in explaining things.
Thank you so much for the wonderful content!!!
Really great tutorial. This is the only in-depth Athena tutorial available on any free or professional network.
Thank you so much for your amazing feedback, really happy to read that you liked the video!
Very helpful and thorough video, Manuel. You are a talented educator.
Thank you so much Charlie, it really means a lot to me to receive such great feedback :)
Awesome as always. I've taken other classes through udemy.com, where he has taught before, and he is very clear and concise. Love his teaching.
Thank you Cassondra :)
Very nicely explained. It helped me get started with AWS Athena.
Very happy to read that you like the video Akshay, thanks a lot for sharing this :)
I just liked your presentation very much. Great job!
Thank you Suresh, happy to read that :)
Very well explained Sir.
Great work !!
Thank you very much Gaurav, really happy to read that you like the explanation :)
Great, Very detailed and perfectly explained, Thanks!
Really happy to read that Rizwan, thank you so much for your comment!
Thank you for this video series. Great explanation.
Thank YOU for your awesome feedback Ripal, really happy to read that you like it!
Hi, could you please provide a video on how to query PostgreSQL data in Athena (Postgres as a data source in Athena)?
Amazing! I have a question though, how do I connect my Amazon RDS database to Athena?
Please explain how to work with Hive bucketed tables in AWS Athena.
Nice video, great work, thanks!
Thanks so much Nilesh, really great to read that you liked the video!
Do we have any alternate way of saving the results of a query as a table? If I download the result as a CSV and then upload it back into AWS, then all the numeric values are downloaded as strings - not possible to do any calculations on those features once you re-upload into AWS. Any workarounds?
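One possible workaround, sketched here rather than taken from the video: Athena supports CREATE TABLE AS SELECT (CTAS), which writes the query result back to S3 as a new table and keeps the column types, so no CSV round trip is needed. The small Python helper below only builds the statement text; the table name, query, and S3 location in the usage line are made-up placeholders.

```python
def build_ctas(new_table: str, select_sql: str, external_location: str,
               fmt: str = "PARQUET") -> str:
    """Build an Athena CTAS statement that stores a query result as a table.

    All arguments are caller-supplied; nothing here is validated or escaped,
    so only use it with trusted input.
    """
    return (
        f"CREATE TABLE {new_table}\n"
        f"WITH (format = '{fmt}', external_location = '{external_location}')\n"
        f"AS {select_sql}"
    )


# Hypothetical names, for illustration only.
statement = build_ctas(
    "population_share",
    "SELECT time, SUM(popfemale) AS popfemale FROM yt_population_table GROUP BY time",
    "s3://example-bucket/ctas/population_share/",
)
print(statement)
```

The printed statement can be pasted into the Athena query editor. The target S3 location should be empty before the statement runs.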
Very well explained.
Thanks a lot for this great feedback!
thanks for the great content!
What is the possible cause of a "Bucket not found" error while previewing the table (running the query to fetch the data)? I have checked the region of the S3 bucket, folder, and file, and the permissions (public, and the file region is the same as Athena). It still says bucket not found.
I had the same problem. It turned out that my issue was that I copied the full bucket URL and hadn't removed ".s3.amazonaws.com" from the bucket name. When I removed that and ended up with s3:///"/ it worked.
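To make the fix above concrete, here is a minimal sketch of a helper that strips the ".s3.amazonaws.com" suffix from a pasted location and returns a plain s3:// URI. The function name and the example bucket are assumptions for illustration, and region-specific endpoints are deliberately not handled.

```python
from urllib.parse import urlparse

# Suffix mentioned in the comment above; regional endpoints such as
# ".s3.eu-west-1.amazonaws.com" are NOT covered by this sketch.
S3_HOST_SUFFIX = ".s3.amazonaws.com"


def to_s3_uri(location: str) -> str:
    """Turn a pasted bucket URL into the s3://bucket/path/ form."""
    parsed = urlparse(location)
    bucket = parsed.netloc
    if bucket.endswith(S3_HOST_SUFFIX):
        bucket = bucket[: -len(S3_HOST_SUFFIX)]
    path = parsed.path or "/"
    return f"s3://{bucket}{path}"


# "my-bucket" is a hypothetical bucket name.
print(to_s3_uri("s3://my-bucket.s3.amazonaws.com/data/"))
```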
Hi Manuel,
Nice video. I have a few questions, if you could please clarify:
1. Is it possible to add an index to a table (say 20 columns, 200,000 rows)? Since Athena doesn't support indexes, how can I speed up SELECT queries?
2. Can I zip a CSV with gzip and upload to save costs in Athena?
3. If I replace/modify the original data (only rows) in S3, do I have to create the table again?
4. If possible can you make a short video on columnar data formats like ORC?
Hi Sandeep,
Thanks a lot for your comment, great to read that you like the video!
Regarding your questions:
1. I do not have a specific solution right here to be honest. One general way to improve overall query performance would be working with partitions though.
2. Yes, gzip is supported by Athena and, as you wrote, is really a good approach to save cost.
3. No, if you just change the rows, you only need to run the query again but you do not have to recreate the table.
4. That’s definitely on my agenda (Apache Parquet as well, actually). I do not have a specific date, but I’ll try to release videos covering these topics in the coming weeks.
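For point 2, a minimal sketch of compressing a CSV with Python's standard gzip module before uploading it to S3. The file name in the usage comment is a placeholder, and the upload step itself is left out.

```python
import gzip
import shutil
from typing import Optional


def gzip_file(src: str, dst: Optional[str] = None) -> str:
    """Compress src with gzip and return the path of the compressed file.

    By default the output is written next to the source with a ".gz" suffix,
    e.g. "population.csv" becomes "population.csv.gz".
    """
    dst = dst or src + ".gz"
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # stream the data, no full read into memory
    return dst


# Usage (hypothetical file name):
# gzip_file("population.csv")
```

Keeping the ".gz" extension on the uploaded file matters, since Athena uses it to recognise the compression.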
Academind, thanks a lot for the input.
Hi, I am facing an issue. After changing the data type of popfemale and poptotal to decimal, SUM(popfemale)/SUM(poptotal) returns 0 or 1 instead of decimal values.
Can you help?
I had the same issue; this worked for me:
SELECT CAST(SUM(popfemale) as DOUBLE)/CAST(SUM(poptotal) as DOUBLE) as sharefemale, time
FROM yt_population_table
GROUP BY time
ORDER BY time ASC
source: gist.github.com/steveodom/33e0f0adc22a8cceac11b6ea1183ebec#division
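The effect of the cast can be tried locally. The sketch below uses Python's built-in sqlite3 as a stand-in for Athena, with integer columns and two invented rows, so it is only an analogy: SQLite truncates integer division, which means the unfixed query gives 0 here rather than the 0 or 1 reported above, but the fix has the same shape (REAL plays the role of DOUBLE).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE yt_population_table ("time" INTEGER, popfemale INTEGER, poptotal INTEGER)'
)
# Invented sample data, not the dataset from the video.
conn.executemany(
    "INSERT INTO yt_population_table VALUES (?, ?, ?)",
    [(2000, 49, 100), (2001, 51, 100)],
)

# Without a cast, both sums are integers and the fraction is lost.
plain_rows = conn.execute(
    'SELECT "time", SUM(popfemale) / SUM(poptotal) AS sharefemale '
    'FROM yt_population_table GROUP BY "time" ORDER BY "time" ASC'
).fetchall()

# Casting both sums to a floating-point type keeps the fraction.
cast_rows = conn.execute(
    'SELECT "time", CAST(SUM(popfemale) AS REAL) / CAST(SUM(poptotal) AS REAL) AS sharefemale '
    'FROM yt_population_table GROUP BY "time" ORDER BY "time" ASC'
).fetchall()

print(plain_rows)
print(cast_rows)
```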
I have a few questions about Athena. Can you give me your contact info?
Waiting for the best practices (partitioning) video!
"RITHERE!"