AmpCode
  • 310 videos
  • 1,905,957 views
Apache Spark Streaming DStream and Window Operations | Data Engineer Full Course | Lecture 22
Welcome to the twenty-second lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll dive deeper into Spark Streaming, focusing on DStreams (Discretized Streams) and windowed operations. These concepts are fundamental for performing advanced real-time data processing tasks.
🔍 What You'll Learn:
Introduction to DStreams and their role in Spark Streaming
Performing transformations on DStreams
Understanding windowed operations and their use cases
Real-world examples of applying windowed operations to streaming data
By the end of this lecture, you’ll be equipped with the knowledge to handle advanced streaming scenarios using DStreams and windowing techniques in Spark S...
Views: 98

Videos

Real Time Data Processing using Spark Streaming Made EASY! | Data Engineer Full Course | Lecture 21
132 views, 14 days ago
Welcome to the twenty-first lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll explore Spark Streaming, a powerful tool for real-time data processing. This lecture simplifies the concepts of streaming and demonstrates how Spark processes live data efficiently. 🔍 What You'll Learn: What is Spark Streaming and its key features Setting up Spark Streaming for real...
Working with DataFrame and Spark SQL | Data Engineer Full Course | Lecture 20
222 views, 21 days ago
Welcome to the twentieth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll combine the power of DataFrames and Spark SQL to efficiently handle and analyze structured data. This session is designed to enhance your understanding of Spark's versatile data processing capabilities. 🔍 What You'll Learn: How to integrate DataFrames with Spark SQL Writing SQL queries...
Spark Optimization Techniques | Data Engineer Full Course | Lecture 19
174 views, 1 month ago
Welcome to the nineteenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll dive into optimization techniques in Apache Spark and PySpark, helping you enhance the performance of your big data processing tasks. Optimization is key to efficiently managing resources and processing large datasets. 🔍 What You'll Learn: The importance of optimization in Spark and P...
Apache Spark Basic DataFrame Operation | Data Engineer Full Course | Lecture 18
143 views, 1 month ago
Welcome to the eighteenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll focus on basic operations with Spark DataFrames. Understanding these operations is critical for manipulating and analyzing structured data effectively in Spark. 🔍 What You'll Learn: How to create DataFrames from different data sources Performing basic operations like select, filter, a...
Working with Spark SQL | Data Engineer Full Course | Lecture 17
266 views, 1 month ago
Welcome to the seventeenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll explore Spark SQL, a module in Apache Spark that enables querying structured data using SQL syntax. Spark SQL is a must-know tool for data engineers working with structured data at scale. 🔍 What You'll Learn: Introduction to Spark SQL and its advantages Creating and querying DataFram...
Introduction to Spark DataFrame | Data Engineer Full Course | Lecture 16
169 views, 1 month ago
Welcome to the sixteenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll introduce Spark DataFrames, a powerful data abstraction in Spark that simplifies big data processing. DataFrames are essential for handling structured data, and mastering them is crucial for efficient data engineering. 🔍 What You'll Learn: What Spark DataFrames are and how they differ ...
Writing and Running Spark Application in Python | Data Engineer Full Course | Lecture 15
180 views, 1 month ago
Welcome to the fifteenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll cover how to write and run a Spark application in Python using PySpark. This hands-on session will help you understand the basics of Spark application development and get you started with writing your own Spark jobs. 🔍 What You'll Learn: Setting up PySpark for Spark application develop...
Apache Spark Transformations and Actions | Data Engineer Full Course | Lecture 14
154 views, 1 month ago
Welcome to the fourteenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll dive into transformations and actions in Apache Spark, the two main types of operations on RDDs that are key to processing data. Knowing how to use transformations and actions will allow you to build powerful data processing pipelines in Spark. 🔍 What You'll Learn: The difference betw...
Apache Spark RDD Explained | Data Engineer Full Course | Lecture 13
213 views, 1 month ago
Welcome to the thirteenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll explore the concept of RDDs (Resilient Distributed Datasets) in Apache Spark. RDDs are the core abstraction in Spark, and understanding them is essential for effective big data processing. 🔍 What You'll Learn: What RDDs are and their role in Apache Spark Key properties of RDDs: immuta...
Apache Spark Architecture | Data Engineer Full Course | Lecture 12
292 views, 2 months ago
Welcome to the twelfth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll dive deep into the architecture of Apache Spark, which is key to understanding how Spark achieves its speed and scalability. Knowing Spark’s architecture will help you make the most of this powerful data processing engine. 🔍 What You'll Learn: Overview of Apache Spark's architecture The ...
Introduction to Apache Spark | Data Engineer Full Course | Lecture 11
247 views, 2 months ago
Welcome to the eleventh lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll introduce you to Apache Spark, a powerful open-source engine for big data processing. Spark’s speed and versatility make it a must-have tool for modern data engineers, and this lecture will lay the foundation for mastering Spark. 🔍 What You'll Learn: What is Apache Spark and how it diff...
Working with HDFS and running a MapReduce Job | Data Engineer Full Course | Lecture 10
324 views, 3 months ago
Welcome to the tenth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll combine our knowledge of Hadoop HDFS and MapReduce to run a MapReduce job on the Hadoop Distributed File System. This practical session will demonstrate how to work with data in HDFS and process it using MapReduce. 🔍 What You'll Learn: How to upload and manage data in HDFS Steps to configu...
Building a simple MapReduce Job | Data Engineer Full Course | Lecture 9
195 views, 3 months ago
Welcome to the ninth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we’ll take a hands-on approach to building a simple MapReduce job in Hadoop. This practical session will help you understand how MapReduce works in action and give you the foundation to build more complex data processing tasks. 🔍 What You'll Learn: Setting up the environment for building a MapRe...
Introduction to YARN in Hadoop | Data Engineer Full Course | Lecture 8
239 views, 3 months ago
Welcome to the eighth lecture of the Data Engineering Full Course series by AmpCode! 🚀 In this video, we will explore YARN (Yet Another Resource Negotiator), a fundamental component of Hadoop that manages resources in a distributed environment. Understanding YARN is essential for optimizing the performance and scalability of Hadoop clusters. 🔍 What You'll Learn: What is YARN and its role in the...
What is MapReduce in Hadoop | Data Engineer Full Course | Lecture 7
201 views, 3 months ago
Understanding Hadoop HDFS | Data Engineer Full Course | Lecture 6
243 views, 3 months ago
Install Hadoop on Windows | Data Engineer Full Course | Lecture 5
1.1K views, 3 months ago
Install Apache Spark PySpark on Windows | Data Engineer Full Course | Lecture 4
1.3K views, 4 months ago
Use Cases and Scenarios for Hadoop and Spark | Data Engineer Full Course | Lecture 3
240 views, 4 months ago
Introduction to Hadoop and Spark | Data Engineer Full Course | Lecture 2
404 views, 4 months ago
Overview of Data Engineering | Data Engineer Full Course | Lecture 1
531 views, 4 months ago
Neo4j Cypher Aggregating Functions | Neo4j Tutorial | Lecture 12
580 views, 5 months ago
Neo4j Cypher Scalar Functions | Neo4j Tutorial | Lecture 11
489 views, 5 months ago
Neo4j Cypher Predicate Functions | Neo4j Tutorial | Lecture 10
623 views, 6 months ago
Neo4j Cypher Values and Data Types | Neo4j Tutorial | Lecture 9
619 views, 6 months ago
Neo4j Cypher Patterns | Neo4j Tutorial | Lecture 8
887 views, 6 months ago
Neo4j Cypher Subqueries | Neo4j Tutorial | Lecture 7
1.4K views, 6 months ago
Neo4j Cypher Clauses | Neo4j Tutorial | Lecture 6
4.3K views, 9 months ago
Real-time vs Batch Data Processing
678 views, 10 months ago

Comments

  • @SundaresanC-wq3wk
    @SundaresanC-wq3wk 3 days ago

    cannot open the repo for fedora project and ius

  • @HimanshuSingh-yj2wh
    @HimanshuSingh-yj2wh 5 days ago

    create a playlist

  • @srisai3634
    @srisai3634 6 days ago

    This content and examples are same as tutorials point

  • @TanishaSharma641
    @TanishaSharma641 7 days ago

    Nice& easy explanation ...😊😊😊

  • @djjames-u1b
    @djjames-u1b 8 days ago

    indeed video is great but while explaining try to explain by considering the points you have put on your screen otherwise its bit confusing to know which point you are talking about......

  • @FEYSALAL-RAHMANMOUCKEYTOU
    @FEYSALAL-RAHMANMOUCKEYTOU 9 days ago

    operation_manager don't appear or simply [PATH_NOT_FOUND] Path does not exist

  • @Ohisthisyou
    @Ohisthisyou 10 days ago

    can someone help , i have downloaded hadoop 3.3 which is the newest version but it is not showing in github . what to do ?

  • @pratiksingh5022
    @pratiksingh5022 11 days ago

    After clicking on launch neo4j the application is not opening in my system

  • @strawberrycy
    @strawberrycy 12 days ago

    this is a lifesaver!

  • @plutonium4574
    @plutonium4574 16 days ago

    Nonsense .. I didn't understand anything about the architecture

  • @Leerosasi
    @Leerosasi 16 days ago

    I am getting this error "WARN NativeCodeLoader: Unable to load native-hadoop library for your platform" I could get the hadoop 2.7, instead I used hadoop 3.3 and its respective winutils. Advise.

  • @ranveersankpal5156
    @ranveersankpal5156 16 days ago

    great 🤩

  • @karthikgandi1677
    @karthikgandi1677 17 days ago

    Note - It's not ---partition, but ---partitions

  • @naren06938
    @naren06938 17 days ago

    This type all services at single platform Nice, but any alternative for online for practice like cloud type, but for free?

  • @manikantsharma3496
    @manikantsharma3496 18 days ago

    The whole video was spent just showing his face

  • @jaykadu6835
    @jaykadu6835 19 days ago

    I want to run a workflow engine project on airflow using docker how can i do that, Do i have to do additional steps. If yes, can you provide it.

  • @ravi-y7b1d
    @ravi-y7b1d 19 days ago

    facing the error while saving the file

  • @ravi-y7b1d
    @ravi-y7b1d 21 days ago

    i did everything exact same but for the later versions of spark 3.x java 8 or 11 is required so just download the java 8 or 11 version and it worked for me.

  • @parvadhami980
    @parvadhami980 22 days ago

    To those who are getting error in CMD:Use "./spark-shell" Instead of just spark-shell in CMD

  • @ArunKumar-wi2et
    @ArunKumar-wi2et 22 days ago

    Bro in jupyter throwing error

  • @prasadbarla7215
    @prasadbarla7215 22 days ago

    spark runs only on java 8 or 11 version it doesn't work with latest version I've tried it

  • @Jayamathi-y1b
    @Jayamathi-y1b 23 days ago

    One of the best . Thank you very much

  • @dhananjayapattnaik7428
    @dhananjayapattnaik7428 23 days ago

    last two days onward i am struggling to install it..please help me

  • @dhananjayapattnaik7428
    @dhananjayapattnaik7428 23 days ago

    request returned Internal Server Error for API route and version %2F%2F.%2Fpipe%2FdockerDesktopLinuxEngine/v1.47/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22%3Atrue%2C%22com.docker.compose.project%3Ddairflow%22%3Atrue%7D%7D, check if the server supports the requested API version i am getting this issue

  • @gangatharan-x4r
    @gangatharan-x4r 24 days ago

    very nice explained

  • @puravraj3517
    @puravraj3517 24 days ago

    I am unable to install python and unable to install the epel https

  • @praveenguduru160
    @praveenguduru160 25 days ago

    Great content

  • @MunniV-c1f
    @MunniV-c1f 26 days ago

    PS C:\Users\dataeng> docker-compose up -d when i am using this command it is showing errors like---- no configuration file provided: not found

  • @Munninarendra
    @Munninarendra 27 days ago

    sir is it possible to install airflow without docker

  • @AbhiShek-m6s
    @AbhiShek-m6s 1 month ago

    I did everything until the environment variables setup, still while using cmd spark-shell it is giving me "'spark-shell' is not recognized as an internal or external command, operable program or batch file." versions I used - For Java: java version "11.0.24" 2024-07-16 LTS Java(TM) SE Runtime Environment 18.9 (build 11.0.24+7-LTS-271) Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.24+7-LTS-271, mixed mode) For Python: Python 3.11.0rc2 For Spark: spark-3.5.3-bin-hadoop3 For Hadoop: (file from below location) winutils/hadoop-3.3.6/bin /winutils.exe

  • @SupravaMishra-e4d
    @SupravaMishra-e4d 1 month ago

    Spark-Shell is not running

  • @SupravaMishra-e4d
    @SupravaMishra-e4d 1 month ago

    I am getting errors continuously after doing the same procedure as well, please reply to me.

  • @vasutke1187
    @vasutke1187 1 month ago

    High clarity and useful. Thanks Sir

  • @geetakavalad8983
    @geetakavalad8983 1 month ago

    I have followed all the steps and added all the system variables but at that time winutils file was not present in my system

    • @geetakavalad8983
      @geetakavalad8983 1 month ago

      Now I have that file how to make the changes plz let me know

  • @k-universe0022
    @k-universe0022 1 month ago

    plz teach more about programs like how to implement these in scenario based programs

  • @k-universe0022
    @k-universe0022 1 month ago

    your method of teaching is so good 👍

  • @rithvikramdas323
    @rithvikramdas323 1 month ago

    Getting PY4javaerror, i have followed all the installation steps

  • @aniketrele7688
    @aniketrele7688 1 month ago

    Do you have idea how to decrypt client side data of Mongo using spark?

  • @sigmaprideu
    @sigmaprideu 1 month ago

    Very good explanation ❤❤

  • @ameenullahsyed8526
    @ameenullahsyed8526 1 month ago

    can I get your email address, wanted to get in touch with you

  • @udaykumar-tb5kn
    @udaykumar-tb5kn 1 month ago

    How to open Linux terminal?? Do u use Amazon Linux or how u able to enter all these

  • @Ahmmmm-y2b
    @Ahmmmm-y2b 1 month ago

    Sir i need java script

  • @Codeyug
    @Codeyug 1 month ago

    Great brother..Thanks from codeyug

  • @tejaschaudhari6424
    @tejaschaudhari6424 1 month ago

    but what if we have installed apache spark manually? I have done this so when I am trying to execute my pyspark script in spyder it's saying no module name pyspark.

  • @naren06938
    @naren06938 1 month ago

    Please try to make videos bit interesting Even bore theory also...you are reading PDFs continuously....iam getting sleepy

  • @rahmaesam2732
    @rahmaesam2732 1 month ago

    still hadoop not recognize even with your installation it give you a warning message " unable to load native.hadoop library"

  • @donjuancapistrano2382
    @donjuancapistrano2382 1 month ago

    The best video on installing payspark, even in 2024. Many thanks to the author!

    • @playtrip7528
      @playtrip7528 1 month ago

      which spark version did u downloaded ?

    • @donjuancapistrano2382
      @donjuancapistrano2382 1 month ago

      @playtrip7528 I downloaded 3.5.3 and pre build for Hadoop 3.3 with 3.0.0 winutils

    • @donjuancapistrano2382
      @donjuancapistrano2382 1 month ago

      @playtrip7528 I downloaded a 3.5.3 version of pyspark and 3.3 pre built for Hadoop with 3.0.0 winutils

  • @anuraggupta5665
    @anuraggupta5665 1 month ago

    Hi @AmpCode Thanks for the great tutorial. I followed each steps and spark is working fine. But when I'm executing some of my pyspark script, I'm getting below Hadoop error: ERROR SparkContext: Error initializing SparkContext. java.lang.RuntimeException: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. Can you please help me on this urgently.. I have set all paths as you showed in video but I'm not able to solve this error. Please Help.

  • @MOHITTHAKKAR-v6d
    @MOHITTHAKKAR-v6d 1 month ago

    Speak in Hindi

  • @fristname-of6ko
    @fristname-of6ko 1 month ago

    Tq so much must needed❤❤❤