- 71 videos
- 572,508 views
Big Tech Talk
India
Joined 31 Aug 2015
We at Big Tech Talk try to cover topics in Big Data and various new technologies in the market. As part of this channel, we intend to bring videos related to big data, Hadoop, Spark, NoSQL databases, HBase, Cassandra, etc.
You can get more info about the latest tech on our blog at www.bigtechtalk.com/
Connect with us at: BigTechTalk/
Code at: github.com/BigTechTalk
Support me at: ko-fi.com/bigtechtalk
Spark Structured Streaming With TCP Socket Sourcing | Spark Tutorial
In this video, we dive into the world of Spark Structured Streaming using a TCP socket as the data source. 🌐 We’ll start by setting up a TCP socket and configuring Spark to read streaming data from it. You’ll learn how to process real-time data streams, perform transformations, and output the results. This tutorial is perfect for data engineers and developers looking to enhance their skills in real-time data processing with Spark. By the end of this video, you’ll have a solid understanding of how to leverage Spark Structured Streaming with TCP sockets for your projects. Don’t forget to like, share, and subscribe for more tech tutorials! 🚀
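The socket source in this tutorial is typically fed from a terminal with nc -lk 9999. Where netcat is unavailable (for example on Windows), a tiny line server can stand in. The sketch below is an illustrative Python stand-in, not code from the video; the function name and sample lines are my own:

```python
import socket
import threading

def serve_lines(lines, host="127.0.0.1", port=0):
    """Serve each string in `lines`, newline-terminated, to the first client
    that connects, then close. Enough to stand in for `nc -lk 9999` when
    trying Spark's socket source locally."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))          # port=0 lets the OS pick a free port
    srv.listen(1)

    def run():
        conn, _ = srv.accept()
        for line in lines:
            conn.sendall((line + "\n").encode())
        conn.close()
        srv.close()

    threading.Thread(target=run, daemon=True).start()
    return srv.getsockname()[1]     # the port Spark should connect to
```

Spark would then attach to it with spark.readStream.format("socket").option("host", "127.0.0.1").option("port", port).load(), exactly as with a netcat-fed socket.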
Subscribe: bit.ly/2A2h6sJ
Facebook: Big...
Views: 59
Videos
Introduction to Spark Structured Streaming | Spark Tutorial
Views: 80 · 1 month ago
Welcome to our introductory video on Spark Structured Streaming! 🚀 In this video, we’ll explore the fundamentals of Spark Structured Streaming, a powerful framework for real-time data processing. You’ll learn about the key differences between batch and streaming applications. Whether you’re new to streaming or looking to enhance your skills, this video is the perfect starting point. Let’s get s...
Set Up Microsoft Azure SQL Server and SQL Database on Azure Cloud | Tutorial
Views: 174 · 5 months ago
Apache Spark - Install Apache Spark 3.x On Ubuntu | Spark Tutorial
Views: 7K · 7 months ago
Apache Spark - Install Apache Spark 3.x On Windows 10 | Spark Tutorial
Views: 21K · 11 months ago
RabbitMQ - How to consume messages from RabbitMQ using Java
Views: 715 · 1 year ago
RabbitMQ - How to send messages to a RabbitMQ broker using Java
Views: 1.7K · 1 year ago
How to Install Apache Kafka on Windows
Views: 20K · 1 year ago
RabbitMQ - Creating Queue, Exchange and Binding and Publishing Message
Views: 6K · 1 year ago
How to Install RabbitMQ Locally with Docker
Views: 8K · 1 year ago
Apache Spark - How to perform Spark Join on DataFrame | Inner Join | Spark Tutorial | Part 17
Views: 1.8K · 1 year ago
Apache Spark - How to Execute SQL query on DataFrame | Spark Tutorial | Part 16
Views: 4K · 1 year ago
Apache Spark - How to add Columns to a DataFrame using Spark & Scala | Spark Tutorial | Part 15
Views: 1.3K · 2 years ago
Apache Spark - How To Rename Columns in a DataFrame using Spark & Scala | Spark Tutorial | Part 14
Views: 1.5K · 2 years ago
Apache Spark - How To Select Columns of a Spark DataFrame using Scala | Spark Tutorial | Part 13
Views: 1.8K · 2 years ago
Read data from MongoDB using Apache Spark | Spark Tutorial | Part 12
Views: 4.6K · 2 years ago
How to Import CSV Data Into MongoDB (NoSQL)
Views: 8K · 2 years ago
How to Install MongoDB on Windows 10
Views: 2.5K · 2 years ago
Apache Spark - Dynamic Partition Pruning | Spark Tutorial | Part 11
Views: 2K · 3 years ago
Apache Spark - UDF (User Defined Function) | Spark Tutorial | Part 10
Views: 2.1K · 3 years ago
Apache Spark - JDBC Source and Sink | Spark Tutorial | Part 9
Views: 6K · 3 years ago
Install Java 15 On Ubuntu 20.04 LTS
Views: 1.9K · 3 years ago
Apache Spark - Parquet To JSON Using Apache Spark and Scala | Spark Tutorial | Part 8
Views: 2.5K · 3 years ago
Apache Spark - CSV to Parquet Using Apache Spark | Spark Tutorial | Part 7
Views: 9K · 4 years ago
Kibana Tutorial | Kibana Visualization - Bar Charts with Split Series and Split Chart
Views: 9K · 4 years ago
How to Install Ubuntu 20.04 LTS on VirtualBox in Windows 10
Views: 501 · 4 years ago
Apache Spark - Install Apache Spark On Ubuntu | Spark Tutorial | Part 6
Views: 35K · 4 years ago
Apache Spark - Basics of Data Frame | Hands On | Spark Tutorial | Part 5
Views: 1.2K · 4 years ago
If I close cmd, the RabbitMQ site stops working. How do I fix it?
Not working.
Which step is not working for you?
thank you
You're welcome
@Big Tech Talk, can you tell me how to handle this error? I have tried a lot to solve it but couldn't. Can you take a look? The error is:

24/10/04 11:38:52 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:601)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:583)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
    at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:772)
    ...
24/10/04 11:38:52 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Program Files\spark-3.4.3\python\pyspark\sql\dataframe.py", line 901, in show
    print(self._jdf.showString(n, 20, vertical))
  File "C:\Program Files\spark-3.4.3\python\lib\py4j-0.10.9.7-src.zip\py4j\protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o56.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (DESKTOP-DARCLVU executor driver): org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
I am getting: java.lang.UnsupportedOperationException: getSubject is supported only if a security manager is allowed
C:\Windows\System32>spark-shell
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
24/09/26 11:27:59 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/09/26 11:27:59 ERROR Main: Failed to initialize Spark session.
java.lang.UnsupportedOperationException: getSubject is supported only if a security manager is allowed
I followed your steps and am facing this error when checking spark-shell, please help. Error: Unable to initialize main class org.apache.spark.deploy.SparkSubmit / Caused by: java.lang.NoClassDefFoundError: scala/MatchError
This is a compatibility issue. Please check your Scala and Java versions.
@@bigtechtalk It's OK now, after installing Apache Spark with the matching Scala version.
@@bigtechtalk thank you
Welcome
Thank you very much!
Thanks for crystal clear explanation
thanks
Stuck at vi .bashrc, I cannot insert SPARK_HOME. Any ideas?
I do not have the INSERT option like you did.
You can open the .bashrc file in the UI and then add SPARK_HOME, or you can use: echo "export SPARK_HOME='YOUR SPARK PATH'" >> .bashrc
You can also use nano to edit the .bashrc file. nano is easier than vi.
@@MerinNakarmi let me give it a try
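If vi's insert mode is the sticking point, the same edit can be scripted. The sketch below is only an illustration of the append-with-echo idea from the reply above; the function name and the /opt/spark path are my own assumptions, not from the video:

```python
from pathlib import Path

def add_spark_env(rc_file, spark_home):
    """Append SPARK_HOME and PATH exports to a shell startup file
    (normally ~/.bashrc) without opening an editor."""
    lines = (
        f"export SPARK_HOME={spark_home}\n"
        "export PATH=$PATH:$SPARK_HOME/bin\n"
    )
    with open(rc_file, "a") as fh:   # append, never overwrite
        fh.write(lines)

# e.g.: add_spark_env(Path.home() / ".bashrc", "/opt/spark"),
# then run `source ~/.bashrc` in the terminal to pick up the change.
```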
nice!
Thanks!
This covers inserting the data the first time, but how do we handle inserts and deletes if there are updates to the database records?
Amazing video. Thank you!
Glad you liked it!
I found this error when trying to open pyspark. I followed all the video steps; is there anything wrong with the command?

scala> SPARK_HOME/bin/pyspark
<console>:23: error: not found: value SPARK_HOME
       SPARK_HOME/bin/pyspark
       ^
<console>:23: error: not found: value pyspark
       SPARK_HOME/bin/pyspark
Your environment variables for SPARK_HOME are not set, or they are incorrect.
Most likely there is no path, as it returned no value (null).
You are in the Spark shell (scala>) and then trying to start pyspark; that will not work. Instead, go to the Spark home location in cmd and type bin/pyspark. It will take you to the pyspark shell.
Thank you, thank you! I'm from Brazil.
thanks
I'm getting this issue: '.\bin\windows\kafka-server.bat' is not recognized as an internal or external command, operable program or batch file. Please make a video on how to change the path variables.
I don't know much about these things, but you sound like a guy who knows what he is talking about. Keep it up!
Thanks
I want to build without Hadoop. Can you help me please?
Hi @HungNguyen-hf8dq, can you tell me what exactly you want to do?
I took the time to follow another "how to install Apache Kafka" article, but it did not work the way I wanted. Thank you so much for your guide; it helped me a lot because it's easy to follow.
Hi @phamhung4018, glad to hear that. Thanks for your comment.
I have done this many times but still get an error each time I try to install pyspark.
Hi @louisizuchi1626, what is the error?
Good explanation thanks
thanks @maghum0723
Awesome work, thanks for the detailed example.
Thanks for your comment
It's working, thank you. In general, when you specify the Kafka location, make sure the folder name has no spaces, like "Program Files". Otherwise it throws "could not find: Files\Kafka\libs\...".
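The "no spaces in the path" rule described above is easy to check up front. A minimal sketch (the function name is mine, not from the video or from Kafka itself):

```python
def kafka_path_ok(install_dir: str) -> bool:
    """Kafka's Windows start scripts split arguments at spaces, so a path
    like 'C:\\Program Files\\Kafka' produces the confusing
    'could not find: Files\\Kafka\\libs\\...' error.
    Return True only for space-free install paths."""
    return " " not in install_dir
```

If this returns False, relocating Kafka to a short, space-free path such as C:/Kafka avoids the error.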
Very good video, it's awesome! Thank you!!
thanks for your comment.
This is the error I got:

Command 'shopt' not found, did you mean: command 'shout' from deb libshout-tools
Try: apt install <deb name>
(repeated four times)
complete: command not found
(repeated many times)
/usr/share/bash-completion/bash_completion:1590: parse error near `|'
└─$ sudo nano +1590 /usr/share/bash-completion/bash_completion
Are you using a Mac?
@@bigtechtalk No, I am using Windows 11; I am trying to install Spark in Kali Linux.
It's showing an error like "The system cannot find the path specified".
Hi @naveennandhi4182, this error usually occurs when your environment path is not set correctly. Please check your environment variables.
Hello, I get this error: The filename, directory name, or volume label syntax is incorrect.
Hi, many thanks for the installation steps. Could you please confirm the reason for adding "$SPARK_HOME/bin" to the PATH variable twice?
Did you get the answer?
@@quanghieuvu1012 No, I didn't get any update yet. But I tried adding the Spark path to the PATH variable once, and it worked.
Hi @nuszkat9953, it's a mistake on my side; adding $SPARK_HOME/bin once will work. I don't know how it slipped in during the recording. Thanks for pointing out the issue.
Hi @quanghieuvu1012, adding $SPARK_HOME/bin once will work. It's a mistake I made while recording the video.
@@bigtechtalk thanks for your reply
Thank you so much.
You're welcome!
Very well explained without any time wasted.
Thanks
I got this error when running the Kafka server: C:\kafka>.\bin\windows\kafka-server-start.bat .\config\server.properties → 'wmic' is not recognized as an internal or external command, operable program or batch file. Please help.
If I write this in pyspark or Scala, does the code differ for Delta Lake?
Hi @SyedAshick-bj2zz, the code should not differ for Delta Lake, as it's just a UDF; only the format changes when you perform a write operation.
What do I need to change in the configuration if I want to be able to produce a message from another machine over the internet?
good tutorial and to the point!
thanks
I could not start ZooKeeper. I am getting an error on my terminal stating the command line is too long / the syntax of the command is incorrect. Is there an alternative way to start ZooKeeper?
Maybe you have solved the issue by now; I solved it by storing the Kafka folder in a short path, e.g. E:/Kafka.
@@pattanaleem6720 tbh I gave up... it was hard without peer review or help. Mind connecting off YouTube?
The most useful video I have seen. Thanks, sir!
You are welcome
Simple and clear illustration; however, there is no Windows support at this moment.
Thanks mate. Was stuck on a thing, your video gave me clue to solve it.
Great to hear!
When I run source ~/.bashrc I'm getting a "command not found" error. What should I do to resolve this?
Try to locate the file in your home directory with the command ls -a.
Thank you from Brazil 🙏
Hello! It's amazing to have viewers from Brazil on my channel. Thank you so much for your support!
C:\Users\srinivas reddy>spark-shell 'spark-shell' is not recognized as an internal or external command, operable program or batch file.
Hi @srinivassri7902, this kind of issue comes up when your environment variables are not set properly. Can you take a look at your environment variables?
Nothing happens when I enter the command to start Zookeeper. The cursor just goes to the next line
I get the same error, did you overcome?
@@pavan5140 No. Don't need it anymore
Please also add site links for easy reference. It's a hassle to search manually.
Hi @Manishkr8316, below is the link to the webpage for installing Spark on Ubuntu: bigtechtalk.com/install-spark-on-ubuntu/spark/
I liked it a lot, straight to the point. Congratulations on the content!
thanks
Great video! worked perfectly
Great to hear!
Ohhhh, thank you, you helped solve my problem with Kafka. Now it's working!
Welcome
You are amazing
thanks
Thanks a lot <3
You're welcome!
Thanks a lot! I succeeded with your tutorial; the explanation was very detailed and clear. Thanks a lot!
Great to hear!
Help me please with this line: val df = spark.read.fomat("csv").option("header",true).load("D:\\SampleData\\input\\country.csv"). It gives: <console>:22: error: value fomat is not a member of org.apache.spark.sql.DataFrameReader
Hi @eyes9716, it should be format, not fomat: val df = spark.read.format("csv").option("header",true).load("D:\\SampleData\\input\\country.csv")