Big Tech Talk
  • 71 videos
  • 572,508 views
Spark Structured Streaming With TCP Socket Sourcing | Spark Tutorial
In this video, we dive into the world of Spark Structured Streaming using a TCP socket as the data source. 🌐 We’ll start by setting up a TCP socket and configuring Spark to read streaming data from it. You’ll learn how to process real-time data streams, perform transformations, and output the results. This tutorial is perfect for data engineers and developers looking to enhance their skills in real-time data processing with Spark. By the end of this video, you’ll have a solid understanding of how to leverage Spark Structured Streaming with TCP sockets for your projects. Don’t forget to like, share, and subscribe for more tech tutorials! 🚀
Subscribe: bit.ly/2A2h6sJ
Facebook: Big...
Views: 59

Videos

Introduction to Spark Structured Streaming | Spark Tutorial
80 views, 1 month ago
Welcome to our introductory video on Spark Structured Streaming! 🚀 In this video, we’ll explore the fundamentals of Spark Structured Streaming, a powerful framework for real-time data processing. You’ll learn about the key differences between batch and streaming applications. Whether you’re new to streaming or looking to enhance your skills, this video is the perfect starting point. Let’s get s...
Set Up Microsoft Azure SQL Server and SQL Database on Azure Cloud | Tutorial
174 views, 5 months ago
Apache Spark - Install Apache Spark 3.x On Ubuntu |Spark Tutorial
7K views, 7 months ago
Apache Spark - Install Apache Spark 3.x On Windows 10 |Spark Tutorial
21K views, 11 months ago
RabbitMQ - How to consume message from RabbitMq using Java
715 views, 1 year ago
RabbitMQ - How to send messages to a RabbitMQ broker using Java
1.7K views, 1 year ago
How to Install Apache Kafka on Windows
20K views, 1 year ago
RabbitMQ - Creating Queue, Exchange and Binding and Publishing Message
6K views, 1 year ago
Spark - Coalesce vs Repartition
1.7K views, 1 year ago
How to Install RabbitMQ Locally with Docker
8K views, 1 year ago
Apache Spark - How to perform Spark Join on DataFrame |Inner Join | Spark Tutorial | Part 17
1.8K views, 1 year ago
Apache Spark - How to Execute SQL query on DataFrame | Spark Tutorial | Part 16
4K views, 1 year ago
Apache Spark - How to add Columns to a DataFrame using Spark & Scala | Spark Tutorial | Part 15
1.3K views, 2 years ago
Apache Spark - How To Rename a Columns in DataFrame using Spark & Scala | Spark Tutorial | Part 14
1.5K views, 2 years ago
Apache Spark - How To Select Columns of a Spark DataFrame using Scala | Spark Tutorial | Part 13
1.8K views, 2 years ago
Read data from MongoDB using Apache Spark | Spark Tutorial | Part 12
4.6K views, 2 years ago
How to Import CSV Data Into MongoDB(NoSql)
8K views, 2 years ago
How to Install MongoDB on Windows 10
2.5K views, 2 years ago
Apache Spark- Dynamic Partition Pruning| Spark Tutorial | Part 11
2K views, 3 years ago
How to Host a Website On Github
21K views, 3 years ago
Apache Spark- UDF ( User Defined Function )| Spark Tutorial | Part 10
2.1K views, 3 years ago
Apache Spark - JDBC Source and Sink | Spark Tutorial | Part 9
6K views, 3 years ago
Install Java 15 On Ubuntu 20.04 LTS
1.9K views, 3 years ago
Apache Spark - Parquet To Json Using Apache Spark and Scala | Spark Tutorial | Part 8
2.5K views, 3 years ago
Apache Spark - CSV to Parquet Using Apache Spark | Spark Tutorial | Part 7
9K views, 4 years ago
Kibana Tutorial | Kibana Visualization - Bar Charts with Split series and Split chart
9K views, 4 years ago
How to Install Ubuntu 20.04 LTS on VirtualBox in Windows 10
501 views, 4 years ago
Apache Spark - Install Apache Spark On Ubuntu |Spark Tutorial | Part 6
35K views, 4 years ago
Apache Spark - Basics of Data Frame |Hands On| Spark Tutorial| Part 5
1.2K views, 4 years ago

Comments

  • @rabbanishaik9342 19 hours ago

    If I close cmd, the RabbitMQ site stops working. How do I fix it?

  • @astralvolt6309 5 days ago

    Not working.

    • @bigtechtalk 4 days ago

      Which step is not working for you?

  • @abdumajidabdullatipov7347 9 days ago

    Thank you

  • @Bharti-q7d 14 days ago

    @Big Tech Talk, can you tell me how to handle this error? I have tried a lot to solve it but couldn't. Can you take a look? The error is here:

    24/10/04 11:38:52 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)/ 1]
    org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:601)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:583)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
    at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:772)
    at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:749)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:514)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:388)
    24/10/04 11:38:52 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
    Traceback (most recent call last): (0 + 0) / 1]
    File "<stdin>", line 1, in <module>
    File "C:\Program Files\spark-3.4.3\python\pyspark\sql\dataframe.py", line 901, in show
    print(self._jdf.showString(n, 20, vertical))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "C:\Program Files\spark-3.4.3\python\lib\py4j-0.10.9.7-src.zip\py4j\java_gateway.py", line 1322, in __call__
    File "C:\Program Files\spark-3.4.3\python\pyspark\errors\exceptions\captured.py", line 169, in deco
    return f(*a, **kw)
    ^^^^^^^^^^^
    File "C:\Program Files\spark-3.4.3\python\lib\py4j-0.10.9.7-src.zip\py4j\protocol.py", line 326, in get_return_value
    py4j.protocol.Py4JJavaError: An error occurred while calling o56.showString.
    : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (DESKTOP-DARCLVU executor driver): org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:601)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:601)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:583)

  • @thejasvenugopal 16 days ago

    I am getting java.lang.UnsupportedOperationException: getSubject is supported only if a security manager is allowed

  • @virgilioespina7556 22 days ago

    Setting default log level to "WARN".
    C:\Windows\System32>spark-shell
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    24/09/26 11:27:59 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    24/09/26 11:27:59 ERROR Main: Failed to initialize Spark session.
    java.lang.UnsupportedOperationException: getSubject is supported only if a security manager is allowed

  • @hakiki1518 28 days ago

    I followed your steps and am facing this error when checking spark-shell, please help.
    Error: Unable to initialize main class org.apache.spark.deploy.SparkSubmit
    Caused by: java.lang.NoClassDefFoundError: scala/MatchError

    • @bigtechtalk 27 days ago

      This is a compatibility issue. Please check your Scala and Java versions.

    • @hakiki1518 27 days ago

      @bigtechtalk It's OK now after installing Apache Spark with the Scala version.

    • @hakiki1518 27 days ago

      @bigtechtalk Thank you

    • @bigtechtalk 26 days ago

      You're welcome

  • @rosassanchezedgar5343 29 days ago

    Thank you very much, cawn

  • @Jay-vh3gx 1 month ago

    Thanks for the crystal-clear explanation

  • @hiimshit 1 month ago

    Stuck at vi .bashrc, cannot insert SPARK_HOME. Any ideas?

    • @hiimshit 1 month ago

      I do not have the INSERT option like you did

    • @bigtechtalk 1 month ago

      You can open the .bashrc file in the UI and then add SPARK_HOME, or you can use echo "export SPARK_HOME='YOUR SPARK PATH' " >> .bashrc

    • @MerinNakarmi 22 days ago

      You can also use nano to edit the .bashrc file. nano is much easier than vi.

    • @bigtechtalk 20 days ago

      @MerinNakarmi Let me give it a try
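
The `echo ... >> .bashrc` suggestion in this thread can be sketched as a small shell function that appends the export only when it is missing, so running it twice does not duplicate the line. This is a minimal sketch rather than the channel's own script: the function name `add_spark_home` and the path `/opt/spark` are assumptions for illustration, and the demo writes to a scratch file instead of the real `~/.bashrc`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the video): append SPARK_HOME to an rc file
# only if no SPARK_HOME export is present yet.
add_spark_home() {
  local rc_file="$1"     # e.g. "$HOME/.bashrc"
  local spark_dir="$2"   # wherever Spark was unpacked; /opt/spark is assumed below
  touch "$rc_file"
  # Skip the append when an export for SPARK_HOME already exists.
  if ! grep -q '^export SPARK_HOME=' "$rc_file"; then
    printf 'export SPARK_HOME="%s"\n' "$spark_dir" >> "$rc_file"
  fi
}

# Demo against a scratch file so the real ~/.bashrc is left alone.
demo_rc="$(mktemp)"
add_spark_home "$demo_rc" "/opt/spark"
add_spark_home "$demo_rc" "/opt/spark"   # second call is a no-op
cat "$demo_rc"
```

To apply it for real, one would pass `"$HOME/.bashrc"` and the actual Spark directory, then open a new terminal so the change is picked up.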

  • @paulovolmar3769 1 month ago

    Nice!

  • @rajthakur3307 1 month ago

    This covers inserting data the first time, but how do you handle inserts and deletes when a database record is updated?

  • @woliveiras 1 month ago

    Amazing video. Thank you!

  • @somaia03 2 months ago

    I got this error when trying to open pyspark. I followed all the video steps. Is there anything wrong with the command?
    scala> SPARK_HOME/bin/pyspark
    <console>:23: error: not found: value SPARK_HOME
    SPARK_HOME/bin/pyspark
    ^
    <console>:23: error: not found: value pyspark
    SPARK_HOME/bin/pyspark

    • @DarthyMaulocus 2 months ago

      Your environment variable for SPARK_HOME is either not set or is incorrect

    • @DarthyMaulocus 2 months ago

      Most likely there is no path, as it returned no value (null)

    • @bigtechtalk 1 month ago

      You are in the Spark shell (scala>) and trying to start pyspark from there. This will not work. Instead, go to the Spark home location in cmd and type bin/pyspark; it will take you to the pyspark shell.

  • @andersontwo7110 3 months ago

    Thank you, thank you, thank you! I'm from Brazil

  • @MRYM-m4e 3 months ago

    '.\bin\windows\kafka-server.bat' is not recognized as an internal or external command, operable program or batch file.
    I'm getting this issue. Please make a video on how to change the path variables.

  • @mr.perfect7921 4 months ago

    I don't know much about these things, but you sound like a guy who knows what he is talking about. Keep it up!

  • @HungNguyen-hf8dq 4 months ago

    I want to build without Hadoop. Can you help me, please?

    • @bigtechtalk 4 months ago

      Hi @HungNguyen-hf8dq, can you tell me what exactly you want to do?

  • @phamhung4018 4 months ago

    I spent time following another article on installing Apache Kafka, but it did not work the way I wanted. Thank you so much for your guide; it helped me a lot because it's easy to follow.

    • @bigtechtalk 4 months ago

      Hi @phamhung4018, glad to hear that. Thanks for your comment.

  • @louisizuchi1626 4 months ago

    I have done this too many times but still get an error each time I try to install pyspark

    • @bigtechtalk 4 months ago

      Hi @louisizuchi1626, what is the error?

  • @madhum0723 4 months ago

    Good explanation, thanks

  • @jimohamed 4 months ago

    Awesome work, thanks for the detailed example

    • @bigtechtalk 4 months ago

      Thanks for your comment

  • @CalisthenicsGymTraining 4 months ago

    It's working, thank you. In general, when you specify the Kafka location, make sure the folder name has no spaces, unlike "Program Files". With a space in the path it throws "could not find: Files\Kafka\libs\..."

  • @pieroteran8626 4 months ago

    Very good video, it's awesome! Thank you!!

    • @bigtechtalk 4 months ago

      Thanks for your comment.

  • @edu_tech7594 5 months ago

    Command 'shopt' not found, did you mean:
    command 'shout' from deb libshout-tools
    Try: apt install <deb name>
    Command 'shopt' not found, did you mean:
    command 'shout' from deb libshout-tools
    Try: apt install <deb name>
    Command 'shopt' not found, did you mean:
    command 'shout' from deb libshout-tools
    Try: apt install <deb name>
    Command 'shopt' not found, did you mean:
    command 'shout' from deb libshout-tools
    Try: apt install <deb name>
    complete: command not found
    complete: command not found
    complete: command not found
    complete: command not found
    complete: command not found
    complete: command not found
    complete: command not found
    complete: command not found
    complete: command not found
    complete: command not found
    /usr/share/bash-completion/bash_completion:1590: parse error near `|'
    \[\e]0;\u@\h: \w\a\]\[\033[;94m\]┌──(\[\033[1;31m\]\u㉿\h\[\033[;94m\])-[\[\033[0;1m\]\w\[\033[;94m\]] \[\033[;94m\]└─\[\033[1;31m\]$\[\033[0m\]
    sudo nano +1590 /usr/share/bash-completion/bash_completion
    This is the error I got.

    • @bigtechtalk 5 months ago

      Are you using a Mac?

    • @edu_tech7594 5 months ago

      @bigtechtalk No, I am using Windows 11 and trying to install Spark in Kali Linux

  • @naveennandhi4182 5 months ago

    It shows an error like: The system cannot find the path specified

    • @bigtechtalk 5 months ago

      Hi @naveennandhi4182, this error usually occurs when your environment path is not set correctly. Please check the environment variable.

  • @tragik0s 5 months ago

    Hello, I get this error: The filename, directory name, or volume label syntax is incorrect.

  • @nuszkat9953 5 months ago

    Hi, many thanks for the installation steps. Could you please confirm the reason for adding "$SPARK_HOME/bin" to the PATH variable twice?

    • @quanghieuvu1012 5 months ago

      Did you get an answer?

    • @nuszkat9953 5 months ago

      @quanghieuvu1012 No, I didn't get any update yet. But I tried adding the Spark path to the PATH variable once. It worked

    • @bigtechtalk 5 months ago

      Hi @nuszkat9953, it's a mistake on my side. Adding $SPARK_HOME/bin once will work. I don't know how I missed it during the recording. Thanks for pointing out the issue.

    • @bigtechtalk 5 months ago

      Hi @quanghieuvu1012, adding $SPARK_HOME/bin once will work. It's a mistake that I made while recording the video.

    • @quanghieuvu1012 5 months ago

      @bigtechtalk Thanks for your reply
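
Since the thread settles on adding `$SPARK_HOME/bin` to `PATH` once, a guard like the one below may help keep the entry from being duplicated when `.bashrc` is sourced repeatedly. It is a sketch under assumptions: the helper name `path_add_once` is hypothetical, and `/opt/spark` is only a fallback used when `SPARK_HOME` is not already set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a directory to PATH only when it is not there yet.
path_add_once() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already present, leave PATH untouched
    *) PATH="$PATH:$1" ;;     # otherwise append it
  esac
}

SPARK_HOME="${SPARK_HOME:-/opt/spark}"   # assumed install location if unset
path_add_once "$SPARK_HOME/bin"
path_add_once "$SPARK_HOME/bin"          # repeated call does not add a second copy
export PATH
```

Wrapping `PATH` in colons before matching means the check compares whole entries, so a directory such as `/opt/spark/bin2` would not be mistaken for `/opt/spark/bin`.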

  • @bouakrazamira6009 5 months ago

    Thank you so much

  • @bitsforbits6581 5 months ago

    Very well explained, without any time wasting

  • @PriyatamSingh-x9v 5 months ago

    I got this error on running the Kafka server:
    C:\kafka>.\bin\windows\kafka-server-start.bat .\config\server.properties
    'wmic' is not recognized as an internal or external command, operable program or batch file.
    Please help

  • @SyedAshick-bj2zz 5 months ago

    If I write this in pyspark or Scala, does the code differ for Delta Lake?

    • @bigtechtalk 5 months ago

      Hi @SyedAshick-bj2zz, the code should not differ for Delta Lake, as it's just the UDF. There will be a change in format when you perform a write operation.

  • @christianharris730 6 months ago

    What do I need to change in the configuration if I want to be able to produce a message from another machine over the internet?

  • @greatvedas 6 months ago

    Good tutorial and to the point!

  • @belvinmudhai4041 6 months ago

    I could not start ZooKeeper. I am getting an error in my terminal stating that the command is too long and the syntax of the command is incorrect. Is there an alternative way to start ZooKeeper?

    • @pattanaleem6720 2 months ago

      Maybe you have solved the issue by now. I solved it by storing the Kafka folder in a short path, e.g. E:/Kafka

    • @belvinmudhai4041 2 months ago

      @pattanaleem6720 To be honest, I gave up. It was hard without peer review or help. Mind connecting off YouTube?

  • @SashaRakoto 6 months ago

    The most useful video I have seen. Thanks, sir!

  • @serverprogrammer 6 months ago

    Simple and clear illustration; however, there is no support for Windows as of now.

  • @ajaymgm 6 months ago

    Thanks, mate. I was stuck on something, and your video gave me a clue to solve it.

  • @INISHASALLOVERIT 6 months ago

    When I run source ~/.bashrc I'm getting a "command not found" error. What should I do to resolve this?

    • @bigtechtalk 6 months ago

      Try to locate the file in your home directory with the command ls -a
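
The check suggested in this reply can be sketched as a small function that confirms the file exists before loading it. The thread does not say what caused the error, so the following is only one possibility: some shells (plain `sh` on many Linux systems, for example) have no `source` builtin, while the POSIX `.` command does the same job. The helper name `load_rc` and the `/opt/spark` value are assumptions, and the demo uses a scratch file rather than the real `~/.bashrc`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: load an rc file if it exists, using the POSIX "." command
# instead of "source" so it also works in shells without a "source" builtin.
load_rc() {
  if [ -f "$1" ]; then
    . "$1"
  else
    echo "not found: $1 (run 'ls -a ~' to check whether .bashrc exists)" >&2
    return 1
  fi
}

# Demo with a scratch file instead of the real ~/.bashrc.
demo_rc="$(mktemp)"
echo 'export SPARK_HOME="/opt/spark"' > "$demo_rc"   # /opt/spark is an assumed path
load_rc "$demo_rc" && echo "SPARK_HOME=$SPARK_HOME"
```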

  • @thfields 6 months ago

    Thank you from Brazil 🙏

    • @bigtechtalk 6 months ago

      Hello! It's amazing to have viewers from Brazil on my channel. Thank you so much for your support!

  • @srinivassri7902 6 months ago

    C:\Users\srinivas reddy>spark-shell
    'spark-shell' is not recognized as an internal or external command, operable program or batch file.

    • @bigtechtalk 6 months ago

      Hi @srinivassri7902, this kind of issue occurs when your environment variable is not set properly. Can you take a look at your environment variables?

  • @sibess7159 6 months ago

    Nothing happens when I enter the command to start Zookeeper. The cursor just goes to the next line

    • @pavan5140 6 months ago

      I get the same error. Did you overcome it?

    • @sibess7159 6 months ago

      @pavan5140 No. I don't need it anymore

  • @Manishkr8316 7 months ago

    Please also add site links for easy reference. It's a hassle to look them up manually.

    • @bigtechtalk 7 months ago

      Hi @Manishkr8316, below is the link to the webpage for installing Spark on Ubuntu: bigtechtalk.com/install-spark-on-ubuntu/spark/

  • @DouglasNaCl2 7 months ago

    I liked it a lot, straight to the point. Congratulations on the content!

  • @bananaboydan3642 7 months ago

    Great video! Worked perfectly

  • @Stu88Oficial 7 months ago

    Ohhhh, thank youuu, you helped solve my problem with Kafka, now it's working!

  • @fahimhossen7842 7 months ago

    You are amazing

  • @jurakatkov5092 7 months ago

    Thanks a lot <3

  • @karryjun4631 7 months ago

    Thanks a lot! I succeeded with your tutorial. The explanation was very detailed and clear. Thanks a lot!

  • @eyes9716 8 months ago

    Help me please, with this line:
    val df = spark.read.fomat("csv").option("header",true).load("D:\\SampleData\\input\\country.csv")
    <console>:22: error: value fomat is not a member of org.apache.spark.sql.DataFrameReader

    • @bigtechtalk 8 months ago

      Hi @eyes9716, it should be format, not fomat:
      val df = spark.read.format("csv").option("header",true).load("D:\\SampleData\\input\\country.csv")