How to Read and Write PySpark DataFrame | Big Data PySpark Tutorial

AmpCode

7K views

Published: Feb 2, 2025

Comments • 30

@ravi-y7b1d 1 month ago
I'm facing the error while saving the file.
@AwaisBinMukhtar 1 year ago ⁺¹
Py4JJavaError: I'm facing this error. Kindly help with it.
@BOSS-AI-20 1 year ago
If it still doesn't work, paste the same file into C:/Windows/System32 and it will work fine.
@AwaisBinMukhtar 1 year ago
I just switched to Databricks and all of the issues were resolved there. There is no need to set everything up yourself. It's like using Colab instead of a Jupyter notebook, so in the same way you can use Databricks for PySpark as well.
@BOSS-AI-20 1 year ago
@AwaisBinMukhtar Nice.
By the way, are you using the Databricks Community version?
@QuangHuy-is7jo 10 months ago
@BOSS-AI-20 Can you explain in more detail?
@villaloboscastanedagerman1171 10 months ago
SAME ERROR
@villaloboscastanedagerman1171 10 months ago
Py4JJavaError: An error occurred while calling o62.save.
: java.lang.UnsatisfiedLinkError: 'boolean
@sameerratnaparkhi8733 9 months ago ⁺²
Download hadoop.dll and set the path. It fixed this issue for me.
@indianintrovert281 9 months ago
@sameerratnaparkhi8733 Thanks, bro, it worked (saved a lot of time).
@monikashinde7227 6 months ago
@sameerratnaparkhi8733 Thank you, it also solved the issue for me, which I had been facing for a week.
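Editor's note: the "download hadoop.dll and set path" fix above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the helper names `missing_hadoop_binaries` and `point_process_at_hadoop` are hypothetical, and the folder you pass in (for example `C:/hadoop`, the location another commenter in this thread uses) is whatever you chose as HADOOP_HOME. It assumes the JVM that PySpark launches inherits the environment of the Python process, so it should run before the SparkSession is created.

```python
import os
from pathlib import Path

# The two native helper files the commenters in this thread report needing on Windows.
REQUIRED_FILES = ("winutils.exe", "hadoop.dll")


def missing_hadoop_binaries(hadoop_home):
    """Return the required file names that are absent from <hadoop_home>/bin."""
    bin_dir = Path(hadoop_home) / "bin"
    return [name for name in REQUIRED_FILES if not (bin_dir / name).is_file()]


def point_process_at_hadoop(hadoop_home, env=os.environ):
    """Set HADOOP_HOME and put its bin folder first on PATH in the given mapping.

    Hypothetical helper: it only edits the environment mapping it is given
    (the current process by default); it does not change system-wide settings.
    """
    bin_dir = str(Path(hadoop_home) / "bin")
    env["HADOOP_HOME"] = str(hadoop_home)
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if bin_dir not in parts:
        env["PATH"] = os.pathsep.join([bin_dir] + parts)
    return env
```

A possible usage: call `missing_hadoop_binaries("C:/hadoop")` first, download whatever it lists, then call `point_process_at_hadoop("C:/hadoop")` at the top of the notebook before building the SparkSession.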
@grandeur_82 8 months ago
I got this error:
Py4JJavaError Traceback (most recent call last)
Cell In[20], line 5
1 output.write\
2 .format("csv").mode("overwrite")\
3 .option("path", "file:///output/op/")\
4 .partitionBy("age")\
----> 5 .save()
@patrickwheeler7107 5 months ago
Curious, did you ever get this figured out?
@DEMON-jg3zl 5 months ago
@patrickwheeler7107 Same for me. Did you?
@patrickwheeler7107 5 months ago
@DEMON-jg3zl I haven't yet. I did some research but I haven't had time to deep dive due to work...
@DEMON-jg3zl 5 months ago
@patrickwheeler7107 Ah, at least thanks for replying to me, "sir". In return, I'll make sure to give you the solution to this particular problem before Monday (yeah, because my fckin free ChatGPT quota is up today).
@patrickwheeler7107 3 months ago
@DEMON-jg3zl I ended up finding out I was missing the hadoop.dll file in my System32 folder. You can Google it and pull it down from a Git repo.
@sachindubey4315 1 year ago
I'm trying to write a file but am facing this error: "java.lang.UnsatisfiedLinkError: 'boolean org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(java.lang.String, int)'
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:793)
at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:1249)
at org.apache.hadoop.fs.FileUtil.list(FileUtil.java:1454)"
Is there any solution for this?
@albertopedro8632 1 year ago
I'VE FACED THE SAME ERROR
@BOSS-AI-20 1 year ago
Check my comment above.
@xx-pn7it 1 year ago
Any solution for that?
@villaloboscastanedagerman1171 10 months ago
I HAVE THE SAME ERROR
@tossthefeathers4135 9 months ago
@villaloboscastanedagerman1171 You need to check which versions of winutils.exe and hadoop.dll are compatible with your Spark version. For example, my Spark version is 3.5.1, and the compatible Hadoop version for it is 3.3.4 or lower.
You can find the compatible Hadoop/winutils version for any Spark version as follows: go to your Spark folder, i.e. the SPARK_HOME location, and open the RELEASE file in Notepad. You will see something like this (in my case):
Spark 3.5.1 (git revision fd86f85e181) built for Hadoop 3.3.4
Build flags: -B -Pmesos -Pyarn -Pkubernetes -Psparkr -Pscala-2.12 -Phadoop-3 -Phive -Phive-thriftserver
So you can see that Spark 3.5.1 was built for Hadoop 3.3.4, which makes 3.3.4 the compatible Hadoop version.
In this case, go to github.com/cdarlint/winutils/tree/master and download both files, hadoop.dll and winutils.exe, from the 3.2.2 folder (there is no folder for 3.3.4 and the next lowest is 3.2.2, which works).
Now paste both files into the bin folder of HADOOP_HOME, i.e. C:/hadoop/bin/.
I did the above and it worked for me. I guess the error occurs because the winutils.exe version is not compatible with the installed Spark version.
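Editor's note: the RELEASE-file check described above can also be done in Python instead of Notepad. This is a minimal sketch, not an official Spark API: the function names `parse_release_line` and `read_spark_release` are hypothetical, and the pattern assumes the first line of RELEASE looks like the one quoted in the comment ("Spark X (git revision ...) built for Hadoop Y").

```python
import re
from pathlib import Path

# Matches a line such as:
#   Spark 3.5.1 (git revision fd86f85e181) built for Hadoop 3.3.4
RELEASE_PATTERN = re.compile(r"Spark\s+(\S+).*?built for Hadoop\s+(\S+)")


def parse_release_line(text):
    """Return (spark_version, hadoop_version) found in the text, or None if no match."""
    match = RELEASE_PATTERN.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


def read_spark_release(spark_home):
    """Read <spark_home>/RELEASE and parse it (assumes the file exists there)."""
    release_text = (Path(spark_home) / "RELEASE").read_text(encoding="utf-8")
    return parse_release_line(release_text)
```

For instance, `read_spark_release(os.environ["SPARK_HOME"])` (after `import os`) would give the Hadoop version to look for when picking a winutils folder, assuming SPARK_HOME is set.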
