Intro to Amazon EMR - Big Data Tutorial using Spark

  • Published: Feb 3, 2025

Comments • 94

  • @jovelynobias5422
    @jovelynobias5422 10 months ago +12

    I hope you create more videos about AWS services. Loved the way you explain things, perfect for beginners.

  • @JeffSylvan-y8j
    @JeffSylvan-y8j 1 year ago +6

    This is an outstanding tutorial. Thank you for making this!

  • @brijeshhota550
    @brijeshhota550 4 months ago +1

    Best tutorial I've seen so far. Was confused between Glue and EMR for a future project requiring big compute power with control over each node.

  • @Munk-tt6tz
    @Munk-tt6tz 9 months ago +2

    So sad your channel doesn't have more tutorials like this :( thank you so much!

  • @isaaclee3714
    @isaaclee3714 10 months ago +1

    This is so goood :). Please keep making these kind of videos! Hello from Seattle

    • @jayzern
      @jayzern 10 months ago

      Thanks Isaac from Seattle! Appreciate your support

  • @DarshilParmar
    @DarshilParmar 9 months ago +1

    Great work mate, very crisp!

    • @jayzern
      @jayzern 9 months ago

      Thanks man!! Love ur content

  • @harishchitluri3137
    @harishchitluri3137 7 months ago

    Absolutely enjoyed watching the entire video. I feel this video is gonna be a great start to understanding EMR. Thanks for making it, Jay

  • @miguelhermar
    @miguelhermar 8 months ago +1

    We need more videos Jaaay 🙏🏻💪🏻 You're awesome dude!

  • @vineethdas4160
    @vineethdas4160 8 months ago

    Awesome explanation: simple, subtle, and to the point!

  • @meghasingal7082
    @meghasingal7082 3 months ago

    Very well explained EMR video, thank you

  • @mattcramer9187
    @mattcramer9187 18 days ago

    Thanks for this video, useful info.

  • @yutao1982
    @yutao1982 1 year ago

    Very clear! Thank you for sharing this excellent tutorial!

  • @jeahyunkim3141
    @jeahyunkim3141 5 months ago

    Thank you!! I watched the YouTube demo and it was really helpful. I also want to study Spark on EKS

  • @szhongy
    @szhongy 1 year ago

    great tutorial! can’t wait to see more

  • @vmmismagic
    @vmmismagic 8 months ago +1

    Hey, thank you so much!!.. you really explain very well!

  • @lucashoww
    @lucashoww 1 year ago

    gnarly stuff man! great content.

  • @StartDataLate
    @StartDataLate 11 months ago +1

    This is crazy ❤❤❤ Wish I had seen this earlier! Is this how the whole Amazon product looks in an actual workflow? And could you maybe make another one showing the Azure system? Pleaaase

  • @AleLopesYTube
    @AleLopesYTube 4 months ago

    Excellent tutorial

  • @carloshenriquekaphos8814
    @carloshenriquekaphos8814 1 year ago

    Go ahead bro....CONGRATS TUTO

  • @dianadai4616
    @dianadai4616 3 months ago +1

    I can see you are brilliant.

  • @prabhathkota107
    @prabhathkota107 8 months ago

    Very well explained, kudos

  • @Ярослав-ю9н8з
    @Ярослав-ю9н8з 1 year ago

    impressive and informative video, good job, go on doing tutorials plss :) Would be very interesting to see a video about spark and snowflake on your channel!

  • @TheDataArchitect
    @TheDataArchitect 3 months ago

    That was FAST, you are subscribed :D
    Any vids related to "Amazon Managed Workflows for Apache Airflow"???

  • @martinghiena5270
    @martinghiena5270 10 months ago

    You killed it. Loved it! Extremely useful

    • @jayzern
      @jayzern 10 months ago

      Thank you man! Hope to create more

  • @markkevnjflores
    @markkevnjflores 6 months ago

    this is amazing, thank you!

  • @nellyoi9831
    @nellyoi9831 3 months ago

    thank you, this is great tutorial

  • @goumze
    @goumze 10 months ago

    Great Article ! Thanks for sharing..

  • @georgesmith9178
    @georgesmith9178 28 days ago

    Based on this video, Amazon EMR is a GUI layer on top of open-source tech like Apache Spark, Hadoop, and so on, enabled by AWS networking infra and storage. Automation does not come out of the box - you still need to do a lot of clicking to set things up, unless of course you know how to write Terraform or CloudFormation scripts.
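
To illustrate the point about scripting the setup instead of clicking through the console, here is a minimal sketch that assembles the request for boto3's EMR `run_job_flow` call. It is not taken from the video: the helper name `build_cluster_request`, the release label, the instance types, and every bucket and subnet value are assumptions chosen for illustration. The role names are the EMR default role names, which may not exist in a given account.

```python
def build_cluster_request(name, log_uri, subnet_id, script_uri, script_args):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    Hypothetical helper: all concrete values below are illustrative assumptions.
    """
    return {
        "Name": name,
        "LogUri": log_uri,
        "ReleaseLabel": "emr-6.15.0",  # assumed release; pick the one you need
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "Primary", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            "Ec2SubnetId": subnet_id,
            # Terminate the cluster once all steps have finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": "Run Spark script",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    # command-runner.jar lets a step run spark-submit on the cluster.
                    "Jar": "command-runner.jar",
                    "Args": ["spark-submit", "--deploy-mode", "cluster",
                             script_uri, *script_args],
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",  # EC2 instance profile
        "ServiceRole": "EMR_DefaultRole",
    }


# Submitting the request needs boto3 and AWS credentials, so it is left as a comment:
#   import boto3
#   boto3.client("emr").run_job_flow(**build_cluster_request(...))
```

Keeping the request as plain data makes it easy to review or version-control before anything is launched; Terraform or CloudFormation would express the same cluster declaratively.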

  • @thanhchien1602
    @thanhchien1602 1 year ago

    Your video is very interesting!
    Hope you release many new videos :)

  • @jasonyuen105
    @jasonyuen105 9 months ago

    nice job, great tutorial

  • @KheireddineAzzez-l3g
    @KheireddineAzzez-l3g 3 months ago

    nice, keep going

  • @hassanlaqrabti4036
    @hassanlaqrabti4036 1 year ago +1

    More tutorials 🙏

  • @mahmoudfadaly8074
    @mahmoudfadaly8074 5 months ago +2

    the type of video that makes me wanna quit the field because of how bad i feel about the level I am in , but its a very helpful video though

    • @shafiq_ramli
      @shafiq_ramli 2 months ago

      Are you working as data engineer right now?

    • @mahmoudfadaly8074
      @mahmoudfadaly8074 2 months ago

      @shafiq_ramli I am looking for a job now as a Data Engineer

    • @mahmoudfadaly8074
      @mahmoudfadaly8074 2 months ago

      @shafiq_ramli if u could help me find one 😢😅

  • @RaGaTales
    @RaGaTales 14 days ago

    Hey! Great video. Can you also recommend alternatives for EMR that are available in the market?

  • @datexland
    @datexland 1 year ago

    Thanks for sharing man 👌

  • @BhavyaJoshi-r4z
    @BhavyaJoshi-r4z 7 months ago

    Great video

  • @EstebanHenryG
    @EstebanHenryG 1 year ago

    Great!! Thank u so much!

  • @pottamvivek
    @pottamvivek 9 months ago

    Great job

  • @NhungNguyen-wh7uf
    @NhungNguyen-wh7uf 1 year ago

    Could you share more about project for data engineer beginners? I have start to learn to be a DE recently and I hope to know more about some personal project that help me to enhance my skills. Thank you so much for your sharing and waiting for your next video :> Have a good day

  • @pradeepnim3689
    @pradeepnim3689 1 year ago

    Thanks .. Good stuff

  • @moverecursus1337
    @moverecursus1337 3 months ago

    Great VIdeo

  • @sisami2109
    @sisami2109 1 year ago

    thanks for the video

  • @atreushouse8848
    @atreushouse8848 7 months ago

    bro thank you i survive

  • @errrbrrr3821
    @errrbrrr3821 1 year ago +1

    great video! can you make also for AWS Glue? Thank you!

  • @Gym_ki_raa_maaya
    @Gym_ki_raa_maaya 6 days ago

    Why did you write your code in VS Code, upload it to AWS, and then upload the same code again in the EMR console to run it?
    What is the difference between the two steps, and how does it work?

  • @tbd4156
    @tbd4156 1 month ago

    💗

  • @tatenda_mk
    @tatenda_mk 1 year ago

    Great tutorials! thanks for the headup! do you have a git repo or more notion notes? would like some guidance

  • @_its_ck
    @_its_ck 1 year ago

    More videos on Streaming, Airflow and Spark

  • @bishop9168
    @bishop9168 10 months ago

    Fantastic tutorial indeed! I did as instructed and I got two fails in deploying the 'add step' part of the EMR Cluster stage, any insights would be appreciated.

    • @123Bankai123
      @123Bankai123 4 months ago

      Hi bishop, I got this error too just now - did you manage to fix it? afaik I did everything the same

  • @tatenda_mk
    @tatenda_mk 1 year ago

    when writing the spark script, does it ever change or the skeleton layout remains the same? i truly appreciate this and i cannot wait for more

  • @mandata143
    @mandata143 1 year ago

    is this free to use or do i need to have a licensed software in order to use? this is quite interesting.

  • @hoangng16
    @hoangng16 6 months ago

    I'm learning data engineering and I find it's hard to find end-to-end data engineering projects to learn building scalable B2B data infrastructure systems that process large amounts of data. Many examples only touch the surface or handle small datasets (well, they say example). Do you have a plan to make a tutorial about actual use cases of data engineering (large and complex data, scalable, cost efficiency system, etc.)?

    • @jayzern
      @jayzern 6 months ago

      Yea for sure, I’ll consider making more end to end videos in the future. Hard to mimic actual use case data engineering with toy projects, but will try to bridge that gap

    • @hoangng16
      @hoangng16 6 months ago

      @jayzern, I'm doing a project to provide a management solution for small businesses. Data such as customer appointment schedules, staff working hours, etc., will be stored in MongoDB. I'm thinking of a data engineering component: some data will be extracted, transformed, and loaded to AWS S3 for visualization on a mobile device. I think it's still simple but, perhaps, good enough

  • @jazzypants4047
    @jazzypants4047 1 year ago

    I am wondering if I only needed to do PySpark, is EMR the best tool or is it overkill and Glue serverless would be good enough with a lot less to manage and fewer configurations to worry about. Is it possible to enable better performance with all the options in EMR?

    • @jazzypants4047
      @jazzypants4047 1 year ago

      And thank you for this video - I’m studying for AWS certification and it was helpful to see your demonstration

  • @shivaramthallapally369
    @shivaramthallapally369 1 year ago

    From where you learn that coding part 😢

  • @ZyklonB-88
    @ZyklonB-88 4 months ago

    why do you need to create a VPC?

    • @etf_chach
      @etf_chach 4 months ago

      The VPC is for the nodes. It allows them to communicate with each other and with the master node.

  • @giovannimaia9652
    @giovannimaia9652 7 months ago

    Please post more videos

  • @syedmehdi5125
    @syedmehdi5125 1 year ago +2

    I hav done masters of science in biotech, 38 yers of age, want to switch to data science...how shud i do it??? Plz reply.....

    • @CK30585
      @CK30585 1 year ago

      Do projects and add them in your resume. Try upwork and do some projects as freelancers. Keep applying

    • @Ved3sten
      @Ved3sten 1 year ago

      Don’t

    • @syedmehdi5125
      @syedmehdi5125 1 year ago

      @Ved3sten y, plz reply...

    • @Ved3sten
      @Ved3sten 1 year ago

      @syedmehdi5125 bc most companies want senior data analysts or graduate students when it comes to data science. You’ll waste more money chasing a data science job than you’ll make

  • @shakendra2011
    @shakendra2011 6 months ago

    Hey why did you create vpc?

    • @jayzern
      @jayzern 6 months ago

      Hey, we typically create VPCs for EMR clusters for more networking control and better security. If I remember correctly, here we defined a public subnet and an internet gateway which connects to S3. You could also use private subnets to avoid attaching an internet gateway, but that's trickier to set up and can incur NAT gateway charges. The video is just an example

  • @jovelynobias5422
    @jovelynobias5422 10 months ago

    Isn't using an EMR notebook one of the ways to trigger an EMR job?

    • @jayzern
      @jayzern 10 months ago

      Yes it is! Wanted to keep things simple in the video so didn't include it

  • @vishnuchopra1127
    @vishnuchopra1127 10 days ago

    Can you provide your pyspark code ???

  • @carloshenriquekaphos8814
    @carloshenriquekaphos8814 1 year ago

    Don't stop

  • @SaurabhKrPathak
    @SaurabhKrPathak 5 months ago

    Just for the information of all the learners: this is not how things are done in the tech industry. You need to understand Terraform scripts along with Jenkins, which deploys the AWS services. You will not get access to the management console to play around and do stuff.

  • @hoangng16
    @hoangng16 6 months ago +1

    Your content is valuable but I find that your presentation is too fast. Sometimes, you fastforward steps too quickly. It makes the video look not lengthy but it's challenging for audience to closely follow your steps.

    • @jayzern
      @jayzern 6 months ago +1

      That’s super helpful feedback, I’ll try to slow down on the steps and talk more in future videos. Hopefully my latest video is slower. How the audience feels matters a lot

  • @DivakarJ-gk6op
    @DivakarJ-gk6op 1 year ago

    nice try but its not working

    • @jayzern
      @jayzern 1 year ago

      Let me know how I can help

    • @DivakarJ-gk6op
      @DivakarJ-gk6op 1 year ago

      I can add a step for the spark application @jayzern

    • @jayzern
      @jayzern 1 year ago

      Check if
      1. the Spark script is encrypted when you upload it inside S3
      2. any typos (line 41 should be "add_argument")
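
The second check refers to the script's argument parsing. Here is a minimal sketch of what that part of a PySpark script often looks like, assuming it uses Python's standard `argparse` module. The function name `parse_args` and the argument names `--data_source` and `--output_uri` are assumptions for illustration, not the video's actual code.

```python
import argparse


def parse_args(argv=None):
    """Parse the job's command-line arguments (argument names are hypothetical)."""
    parser = argparse.ArgumentParser(description="Example Spark job arguments")
    # The method is add_argument; a misspelling raises AttributeError
    # as soon as the step starts, which makes the EMR step fail.
    parser.add_argument("--data_source", required=True,
                        help="Input location, for example an S3 URI")
    parser.add_argument("--output_uri", required=True,
                        help="Output location, for example an S3 URI")
    return parser.parse_args(argv)


# Example with placeholder values; on EMR these come from the step's arguments.
example_args = parse_args([
    "--data_source", "s3://example-bucket/input.csv",
    "--output_uri", "s3://example-bucket/output/",
])
```

Running the parsing locally with placeholder values like this catches typos before the script is uploaded and submitted as a step.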

    • @DivakarJ-gk6op
      @DivakarJ-gk6op 1 year ago

      I tried, but it's not working for me @jayzern

    • @jayzern
      @jayzern 1 year ago

      Send me a DM on instagram @jayzern or linkedin, happy to pair up

  • @koliux1
    @koliux1 11 months ago

    Yeah, good at EMR and AWS, but an absolute rookie in videography and equipment. Use manual focus since you are stationary; your autofocus keeps hunting for something. And change the lighting setup.

    • @jayzern
      @jayzern 11 months ago

      Fair point 👍 will work on lighting and camera setup more next time

  • @christinachen9669
    @christinachen9669 10 months ago

    Love the ways how you demonstrate! so clear and easy to understand! Thanks for sharing @jayzern

  • @chulada03
    @chulada03 1 year ago

    thanks so much