How to Run a Spark Cluster with Multiple Workers Locally Using Docker

  • Published: 18 Dec 2024

Comments • 55

  • @АртёмМеркулов-ю3к 15 days ago +1

    Thank you so much for this video! Very helpful!

  • @SzTz100 3 months ago +3

    I haven't tried this yet, but if it works, you are a prince amongst men.

    • @thedataguygeorge 2 months ago

      Hahahaha let me know how it goes my man!

    • @SzTz100 1 month ago

@thedataguygeorge I just tried it and it worked for me, great job.

  • @josephdaquila2479 4 months ago +2

    This also seems like a good introduction to Docker. I am definitely getting a feel for the advantages of the tool.

  • @subhashkumar209 2 months ago +1

    Well, I had been searching for a course where we can do Spark development using an IDE, run complete end-to-end testing, and deploy to Azure Databricks. For the past 3 years I didn't find anything, but today I watched this video and I can say it's at least a starting point. If you can create more videos on local Azure Databricks development and deployment, I assure you, you will be the king.

  • @rasmusandreasson1548 9 months ago +1

    The king! Thank you for the good content!

  • @not_saboor 9 months ago +1

    Thanks for this!

  • @Levy957 9 months ago +1

    I love your videos

  • @LavieAdam-qq4uf 12 days ago +1

    Can I submit multiple custom jobs to the cluster at the same time?
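
A Spark standalone master will schedule multiple applications concurrently as long as cores and memory are available. A hedged sketch of two simultaneous submissions; the container name, master URL, and script paths are assumptions, not taken from the video:

```shell
# Hypothetical: submit two applications to the same standalone master.
# Capping spark.cores.max per app keeps one job from starving the other
# under the default FIFO resource allocation.
docker exec da-spark-master spark-submit \
  --master spark://spark-master:7077 \
  --conf spark.cores.max=2 \
  /opt/spark/apps/job_a.py &

docker exec da-spark-master spark-submit \
  --master spark://spark-master:7077 \
  --conf spark.cores.max=2 \
  /opt/spark/apps/job_b.py &
wait
```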

  • @mayowaoludoyi5425 8 months ago +1

    Thank you for this walkthrough video. How can I establish a connection to a relational database like Oracle from "dockerised" Spark like this? I understand there is a different setup that requires JDBC. Where does it fit in your setup?

    • @thedataguygeorge 8 months ago +2

      Hey, you would add it similarly to how I connect to Snowflake in other scripts, where you use the Python ODBC drivers to establish connections to relational DBs like Oracle
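
The reply mentions ODBC drivers, but the usual Spark route to a relational database is the built-in JDBC data source with the vendor's driver jar on the classpath. A minimal sketch; the jar path, host, credentials, and table name are all placeholders, not from the video:

```python
# Hedged sketch: reading an Oracle table through Spark's JDBC source.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("oracle-read")
    # The Oracle JDBC driver jar must exist at this path inside the container.
    .config("spark.jars", "/opt/spark/jars/ojdbc8.jar")
    .getOrCreate()
)

df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1")
    .option("dbtable", "MY_SCHEMA.MY_TABLE")
    .option("user", "my_user")
    .option("password", "my_password")
    .option("driver", "oracle.jdbc.OracleDriver")
    .load()
)
df.show()
```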

  • @dongtandung9671 1 month ago

    Do you have this on a repo so that we can take a look at the whole thing?

  • @nansambassassensalo3065 2 months ago

    Also wondering how you addressed the JAVA_HOME path setup. My error message is that it's not set.
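
The "JAVA_HOME is not set" error usually means the image installs Java but never exports its path. A hedged Dockerfile fragment; the exact path is an assumption that matches Debian/Ubuntu base images with OpenJDK 17 on amd64, so adjust for your image:

```dockerfile
# Hypothetical fragment: install a JDK and export JAVA_HOME for Spark.
RUN apt-get update && apt-get install -y openjdk-17-jdk-headless
ENV JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
ENV PATH="${JAVA_HOME}/bin:${PATH}"
```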

  • @early-riser18 7 months ago

    Thank you for the rundown - very helpful. Could you add a link to the code written please? Some code in the Dockerfile is hidden by the right-side screen fold and has to be guessed. Thanks :)

  • @not_saboor 7 months ago +1

    Can you explain the part on Jinja templating you mentioned at 3:40?

    • @thedataguygeorge 7 months ago

      Sure! What specifically about it are you interested in learning more about?
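
As a starting point while that question is open, here is a minimal illustration of Jinja templating in this context (not from the video; it assumes the jinja2 package is installed): a template string is rendered with concrete values to produce a spark-submit command.

```python
# Minimal Jinja templating illustration: render a spark-submit
# command from a template string with placeholder variables.
from jinja2 import Template

template = Template(
    "spark-submit --master {{ master_url }} {{ app_path }}"
)
command = template.render(
    master_url="spark://spark-master:7077",
    app_path="/opt/spark/apps/job.py",
)
print(command)
# → spark-submit --master spark://spark-master:7077 /opt/spark/apps/job.py
```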

  • @josephdaquila2479 4 months ago

    So this tutorial would also help me set this up to where I'm running computations on a server?

  • @imanitrecruiterineurope4142 7 months ago

    Hi!
    It seems that the applications aren't taking any resources and are stuck in a loop on my end. What could be the cause?

  • @Jalabulajunx 7 months ago +1

    You don't have a requirements directory, so how will req/req.txt work?

    • @thedataguygeorge 7 months ago

      With Spark, you'll typically initiate a spark session and provide a list of requirements you need for that particular session
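
A common alternative to per-session requirements (an assumption about this setup, not necessarily what the video does) is to bake the requirements file into the image at build time, so the driver and every worker share the same Python environment:

```dockerfile
# Hypothetical fragment: the requirements path mirrors the req/req.txt
# mentioned in the question above; adjust to your repo layout.
COPY req/req.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt
```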

  • @josephdaquila2479 4 months ago

    So the Spark workers could be more physical computers or multiple VMs?
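
In the video's setup the workers are containers on one host, but a standalone worker can run anywhere it can reach the master. A hedged sketch of starting a worker on another machine or VM; the image name and master hostname are assumptions:

```shell
# Hypothetical: run one more worker on a different host/VM, pointing
# it at the standalone master's address.
docker run -d --name extra-worker da-spark-image \
  /opt/spark/bin/spark-class org.apache.spark.deploy.worker.Worker \
  spark://spark-master-host:7077
```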

  • @stars-and-clouds 10 days ago

    Missed the part where you need to add the Spark conf file
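
For reference, a minimal conf/spark-defaults.conf of the kind a setup like this typically ships; every value below is a placeholder assumption, not the video's actual file:

```
spark.master                     spark://spark-master:7077
spark.eventLog.enabled           true
spark.eventLog.dir               /opt/spark/spark-events
spark.history.fs.logDirectory    /opt/spark/spark-events
```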

  • @royteicher 3 months ago

    Why are all the image names 'da-spark-image'? I get a pull access denied error. This tutorial is amazing and exactly what I was looking for, but I can't make it happen.

    • @Sudo801 2 months ago

      I also have this issue

    • @Sudo801 2 months ago +1

      So I finally got it working, even while getting the pull access denied prompt. My issue ended up being that the line "RUN curl downloads.apache.... " in the Dockerfile had an error I needed to fix for it to work.
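
A frequent cause of that curl failure is that downloads.apache.org only hosts current releases, so a pinned version's URL goes stale; archive.apache.org retains old releases. A hedged Dockerfile sketch of the fix described above; the Spark version and paths are assumptions:

```dockerfile
# Hypothetical fix: pin a Spark version and pull from the Apache archive,
# which keeps old releases available.
ENV SPARK_VERSION=3.5.1
RUN curl -fsSL "https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop3.tgz" \
      -o spark.tgz \
 && tar -xzf spark.tgz -C /opt \
 && mv /opt/spark-${SPARK_VERSION}-bin-hadoop3 /opt/spark \
 && rm spark.tgz
```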

    • @rahulgoala1991 1 month ago

      I am facing the same issue... what changes are required to make it work? Please help.

    • @АртёмМеркулов-ю3к 15 days ago

@Sudo801 Thank you for this comment!

  • @ritwikverma2463 2 months ago

    I am not able to create the .env.spark file on my MacBook M1, please share a solution

  • @whramijg 1 month ago

    So you came up with this all by yourself?

  • @csmithDevCove 9 months ago +1

    What about connecting spark-nlp to this?

    • @thedataguygeorge 9 months ago

      You would just want to add it to be installed within the Docker image!
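
Following the reply above, "installed within the image" could look like the fragment below; this is a hedged sketch, and pinning a version compatible with your Spark release is left to the reader:

```dockerfile
# Hypothetical fragment: install the Spark NLP Python package at build
# time; the matching jars can then be pulled at session start via the
# spark.jars.packages config.
RUN pip install --no-cache-dir spark-nlp
```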

  • @artemqqq7153 3 months ago

    Is it possible to build this without the Makefile? It is challenging to install it on Windows...
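
A Makefile is just a shortcut for plain commands, so it can be bypassed entirely. A hedged sketch of the likely equivalents; the image name, service name, and worker count are assumptions based on the video, so adjust to your compose file:

```shell
# Hypothetical make-free workflow on Windows (PowerShell or cmd):
docker build -t da-spark-image .
docker compose up -d --scale spark-worker=2
docker compose down
```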

  • @ccc_ccc789 9 months ago

    Thanks

  • @josephdaquila2479 4 months ago

    Please also explain daemons vs. tasks

  • @rafaellourenco4599 7 months ago +2

    Bro, you skipped all the bug stuff

    • @thedataguygeorge 7 months ago +1

      Sorry, I was solving them off camera, but I will make sure to show more of the troubleshooting process next time!

    • @rafaellourenco4599 7 months ago +2

@thedataguygeorge can you share a repo with this project?

  • @Jalabulajunx 7 months ago

    I am always getting "entrypoint.sh not found", has anyone figured it out?
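
On Windows checkouts this error usually comes from CRLF line endings breaking the script's shebang, or from a missing executable bit. A hedged Dockerfile sketch of the common fix; the script path is an assumption:

```dockerfile
# Hypothetical fix: strip carriage returns (a CRLF shebang produces the
# misleading "not found" error) and mark the entrypoint executable.
COPY entrypoint.sh /opt/spark/entrypoint.sh
RUN sed -i 's/\r$//' /opt/spark/entrypoint.sh \
 && chmod +x /opt/spark/entrypoint.sh
ENTRYPOINT ["/opt/spark/entrypoint.sh"]
```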