Installation of Apache Spark on Windows 11 (in 5 minutes)
- Published: 21 Aug 2024
- Prerequisites for installing Apache Spark:
1. Download Java
2. Download Python
3. Download Apache Spark
4. Download Hadoop Winutils and hadoop.dll
5. Set environment variables (see the sketch after this list)
6. Test PySpark and Spark Shell
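A minimal sketch of steps 5 and 6 for the Windows Command Prompt. Every path below is an assumption; point each variable at wherever you actually unpacked that tool. Note that setx only affects terminals opened afterwards:

:: hypothetical install locations -- replace with your own (avoid spaces in paths)
setx JAVA_HOME "C:\java\jdk-17"
setx SPARK_HOME "C:\spark"
setx HADOOP_HOME "C:\hadoop"
:: winutils.exe and hadoop.dll go in %HADOOP_HOME%\bin
setx PATH "%PATH%;%JAVA_HOME%\bin;%SPARK_HOME%\bin;%HADOOP_HOME%\bin"

:: step 6, in a NEW terminal: both commands should print the Spark version
spark-shell --version
pyspark --version

setx can truncate long PATH values, so on a cluttered machine the Environment Variables dialog is the safer way to edit PATH.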
Important Links:
github.com/kon...
spark.apache.o...
www.oracle.com...
www.python.org...
Why am I getting the "The system cannot find the path specified" error?
Check your environment variables.
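In practice that error usually means one of the *_HOME variables points at a folder that does not exist, or at the bin folder instead of the install root. A quick way to see what the shell actually resolves, in Command Prompt:

echo %JAVA_HOME%
echo %SPARK_HOME%
echo %HADOOP_HOME%
:: each printed path should exist on disk; then check the command is found:
where spark-shell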
I was able to set up spark-shell, but when I run spark-shell --master yarn it shows:
Exception in thread "main" org.apache.spark.SparkException: When running with master 'yarn' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
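The exception names the fix: --master yarn requires HADOOP_CONF_DIR (or YARN_CONF_DIR) to point at the directory holding your cluster's config files (core-site.xml, yarn-site.xml). A sketch with a hypothetical config directory, and it presumes you actually have a YARN cluster to connect to; for a laptop install, plain local-mode spark-shell is what this video covers:

setx HADOOP_CONF_DIR "C:\hadoop\etc\hadoop"
:: open a new terminal, then:
spark-shell --master yarn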
Is this a stable version?
I got an error:
Py4JJavaError: Python worker failed to connect back...
Yes, it's a stable version.
Same error here.
Missing Python executable 'python3', defaulting to 'C:\Users\punya\AppData\Local\Programs\Python\Python312\Scripts\..' for SPARK_HOME environment variable. Please install Python or specify the correct Python executable in PYSPARK_DRIVER_PYTHON or PYSPARK_PYTHON environment variable to detect SPARK_HOME safely.
The system cannot find the path specified.
I'm facing the above error.
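Both messages typically go away once Spark is told exactly which python.exe to use, since Windows installs rarely provide a python3 executable. A sketch with a hypothetical install path; substitute the output of "where python" on your own machine:

:: find your real interpreter first:
where python
:: then pin Spark to it (the path below is an example, not yours):
setx PYSPARK_PYTHON "C:\Users\<you>\AppData\Local\Programs\Python\Python312\python.exe"
setx PYSPARK_DRIVER_PYTHON "C:\Users\<you>\AppData\Local\Programs\Python\Python312\python.exe"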
"The system cannot find the path specified." What does this mean?
Check your environment variables.
A big heads-up: if any of the paths in the system variables involved contains a space, Spark does not support it. I'm warning you because it cost me an hour to find the error.
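For example (both paths hypothetical), an install under a folder with a space breaks, while a space-free one works; the 8.3 short name is a workaround if you cannot move the files, though short names can be disabled on some volumes:

:: breaks: the space in "Program Files" trips up Spark's launch scripts
setx SPARK_HOME "C:\Program Files\spark"
:: works: no spaces anywhere in the path
setx SPARK_HOME "C:\spark"
:: workaround: 8.3 short form of the same folder
setx SPARK_HOME "C:\PROGRA~1\spark"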
Thanks for this video. Did you skip the Hadoop installation part?
My system says: "HADOOP_HOME" and "hadoop.home.dir" are unset,
as well as "Did not find winutils.exe: java.io.FileNotFoundException: java.io.FileNotFoundException".
I followed your steps as instructed, but the Spark shell worked despite these warnings.
You can rename the WINUTILS environment variable to HADOOP_HOME.
@UnboxingBigData But you also have HADOOP_HOME in your variables; I saw %HADOOP_HOME% in the video!? Why?
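Either way, the layout Spark's Hadoop shim looks for is a HADOOP_HOME whose bin subfolder holds the two downloaded files. A sketch with a hypothetical location:

:: expected layout:
::   C:\hadoop\bin\winutils.exe
::   C:\hadoop\bin\hadoop.dll
setx HADOOP_HOME "C:\hadoop"
setx PATH "%PATH%;%HADOOP_HOME%\bin"
:: some guides also copy hadoop.dll into C:\Windows\System32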
spark-shell : The term 'spark-shell' is not recognized as the name of a cmdlet, function, script file, or operable
program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:1
+ spark-shell
+ ~~~~~~~~~~~
+ CategoryInfo : ObjectNotFound: (spark-shell:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException
Check the Spark environment variables.
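Concretely: the shell only finds spark-shell if %SPARK_HOME%\bin is on PATH. A quick check in Command Prompt, assuming SPARK_HOME itself is already set:

echo %SPARK_HOME%
where spark-shell
:: if "where" finds nothing, add the bin folder and open a new terminal:
setx PATH "%PATH%;%SPARK_HOME%\bin"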
I have followed all the steps correctly, but the terminal still shows "The filename, directory name, or volume label syntax is incorrect." The paths are set correctly.
Share a screenshot.
Hi, after the above steps I get this error; what should I do?
scala> 24/06/07 11:17:11 WARN GarbageCollectionMetrics: To enable non-built-in garbage collector(s) List(G1 Concurrent GC), users should configure it(them) to spark.eventLog.gcMetrics.youngGenerationGarbageCollectors or spark.eventLog.gcMetrics.oldGenerationGarbageCollectors
Share a screenshot by email.
@UnboxingBigData Was this resolved? I also need help with this.
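For what it's worth, that GarbageCollectionMetrics line is a WARN, not an error, and the shell keeps working. If you want to silence it, the message itself names the setting to extend; a sketch that appends the collector the warning mentions to Spark's default young-generation list (the list contents here are an assumption based on Spark's documented defaults):

spark-shell --conf "spark.eventLog.gcMetrics.youngGenerationGarbageCollectors=Copy,PS Scavenge,ParNew,G1 Young Generation,G1 Concurrent GC"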
I am still getting errors (ERROR SparkContext: Error initializing SparkContext.). Can you share your email ID?