Thanks for the demo and presentation; it's rare to find such professionally structured materials!
Happy to help you with the best resources! 🤝 ❤️ 😎
This is an excellent presentation providing concise depth on a number of critical features
Great video. Can we get that Step Functions code?
Great Video. To the point, clear and very helpful. THANK YOU!
Very informative! Since the EMR File System (EMRFS) provides access to S3, can I use EMRFS as the storage layer and run Spark on top of it to handle big data? Is it possible to replace Hadoop with S3?
Can we use a configuration of 1 master node, 1 core node, and task nodes for all the rest (these would be Spot-based)? Is this setup OK for a transient (~2h) workload? I have noticed that the core node is used as the driver when running in cluster mode, which is why I believe there must be at least 1 On-Demand core node. Is this assumption valid? What would you suggest?
I'm quite intrigued by the small file problem. Say there is a table with 10 records, and when we convert them we create a Parquet file. Where would that cause problems? During external table creation, when we run a query, or during an analytics job performing complex calculations?
Fantastic demo, and the presentation is rich and cool. It's very helpful for designing productionized solutions.
This is amazing, appreciate it. :)
In the scaling comparison, the time scale is different for the two graphs (12h vs 14 days). I believe the graph would be much smoother over longer time frames.
Rich content. Thank you!
brilliant
Is the step function code open sourced?
AWESOME
What's meant by NoSQL @ 13:33?
Hi, there. 👋 NoSQL databases store data differently than relational databases; the data schema is flexible, allowing for easier scaling opportunities. You can head over to this link for more info: go.aws/3zpKjge. 🔗 For further questions, we suggest reaching out to our helpful community of specialists at re:Post: go.aws/aws-repost. 📝 ^ZP
@awssupport Yes, I am aware of what a NoSQL database is; I just wasn't sure why it was being compared to the other items on that slide. It's not a big deal, but the label "transactional database" would probably have been more apt.
Great presentation, covering key features, well explained, and showing demos as well. Thanks very much ;)
This is a marvel. I read a book with similar content, and it was a marvel to behold. "Mastering AWS: A Software Engineers Guide" by Nathan Vale
Ditto on the Step Functions code. Great video though!
A lot of information for sure, but not explained well enough to actually understand it.