Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training | Edureka

  • Published: Sep 8, 2024
  • 🔥 Apache Spark Training (Use Code "YOUTUBE20"): www.edureka.co...
    This Edureka Spark Tutorial (Spark Blog Series: goo.gl/WrEKX9) will help you understand all the basics of Apache Spark. This Spark tutorial is ideal both for beginners and for professionals who want to learn Apache Spark or brush up on its concepts. Below are the topics covered in this tutorial:
    02:13 Big Data Introduction
    13:02 Batch vs Real Time Analytics
    1:00:02 What is Apache Spark?
    1:01:16 Why Apache Spark?
    1:03:27 Using Spark with Hadoop
    1:06:37 Apache Spark Features
    1:14:58 Apache Spark Ecosystem
    1:18:01 Brief introduction to complete Spark Ecosystem Stack
    1:40:24 Demo: Earthquake Detection Using Apache Spark
    Subscribe to our channel to get video updates. Hit the subscribe button above.
    PG in Big Data Engineering with NIT Rourkela : www.edureka.co... (450+ Hrs || 9 Months || 20+ Projects & 100+ Case studies)
    #edureka #edurekaSpark #SparkTutorial #SparkOnlineTraining
    Check our complete Apache Spark and Scala playlist here: goo.gl/ViRJ2K
    How does it work?
    1. This is a 4-week instructor-led online course with 32 hours of assignments and 20 hours of project work.
    2. We provide 24x7 one-on-one LIVE technical support to help you with any problems you might face or any clarifications you may require during the course.
    3. At the end of the training you will work on a project, based on which we will provide you with a grade and a verifiable certificate!
    - - - - - - - - - - - - - -
    About the Course
    This Spark training will enable learners to understand how Spark executes in-memory data processing and runs much faster than Hadoop MapReduce. Learners will master Scala programming and will get trained on the different APIs that Spark offers, such as Spark Streaming, Spark SQL, Spark RDD, Spark MLlib and Spark GraphX. This Edureka course is an integral part of a Big Data developer's learning path.
    After completing the Apache Spark and Scala training, you will be able to:
    1) Understand Scala and its implementation
    2) Master the concepts of Traits and OOP in Scala programming
    3) Install Spark and implement Spark operations on Spark Shell
    4) Understand the role of Spark RDD
    5) Implement Spark applications on YARN (Hadoop)
    6) Learn Spark Streaming API
    7) Implement machine learning algorithms in Spark MLlib API
    8) Analyze Hive and Spark SQL architecture
    9) Understand Spark GraphX API and implement graph algorithms
    10) Implement Broadcast variables and Accumulators for performance tuning
    11) Work on real-time Spark projects
    - - - - - - - - - - - - - -
    Who should go for this Course?
    This course is a must for anyone who aspires to enter the field of big data and keep abreast of the latest developments in fast, efficient processing of ever-growing data using Spark and related projects. The course is ideal for:
    1. Big Data enthusiasts
    2. Software Architects, Engineers and Developers
    3. Data Scientists and Analytics professionals
    - - - - - - - - - - - - - -
    Why learn Apache Spark?
    In this era of ever-growing data, the need to analyze it for meaningful business insights is paramount. There are different big data processing alternatives such as Hadoop, Spark, Storm and many more. Spark, however, is unique in providing both batch and streaming capabilities, making it a preferred choice for lightning-fast big data analysis platforms.
    The following Edureka blogs will help you understand the significance of Spark training:
    5 Reasons to Learn Spark: goo.gl/7nMcS0
    Apache Spark with Hadoop, Why it matters: goo.gl/I2MCeP
    For more information, please write back to us at sales@edureka.co or call us at IND: 9606058406 / US: 18338555775 (toll-free).
    Instagram: / edureka_learning
    Facebook: / edurekain
    Twitter: / edurekain
    LinkedIn: / edureka
    Telegram: t.me/edurekaup...
    Customer Review:
    Michael Harkins, System Architect, Hortonworks says: “The courses are top rate. The best part is live instruction, with playback. But my favorite feature is viewing a previous class. Also, they are always there to answer questions, and prompt when you open an issue if you are having any trouble. Added bonus ~ you get lifetime access to the course you took!!! Edureka lets you go back later, when your boss says "I want this ASAP!" ~ This is the killer education app... I've taken two courses, and I'm taking two more.”

Comments • 208

  • @edurekaIN
    @edurekaIN  6 years ago +6

    Got a question on the topic? Please share it in the comment section below and our experts will answer it for you. For Edureka Apache Spark Certification Training Curriculum, Visit the website: bit.ly/2KHSmII

  • @arunasingh8617
    @arunasingh8617 2 years ago +1

    Well explained the concept of Lazy Evaluation!

    • @edurekaIN
      @edurekaIN  2 years ago

      Good To know our videos are helping you learn better :) Stay connected with us and keep learning ! Do subscribe the channel for more updates : )

  • @AdalarasanSachithanantham
    @AdalarasanSachithanantham 2 months ago

    First time here, and I'm really impressed with the way you are teaching. God bless you!

  • @daleoking1
    @daleoking1 6 years ago +10

    This makes things more clear after my Data Science class lol. Thank you so much for a great tutorial, I think this will sharpen me up.

    • @edurekaIN
      @edurekaIN  6 years ago +1

      Hey, thank you for watching our video. Do subscribe and stay connected with us. Cheers :)

  • @draxutube
    @draxutube 1 year ago

    So good to watch, I understood so much!

  • @kag1984007
    @kag1984007 7 years ago +7

    So far this is the 4th course I am watching; the instructors from Edureka are amazing. RDD was very well explained in the first half. Worth watching!!!

    • @edurekaIN
      @edurekaIN  7 years ago +1

      Hey Kunal, thanks for the wonderful feedback! We're glad we could be of help.
      We thought you might also like this tutorial: ruclips.net/video/uD_q4Rm4i2Q/видео.html.
      You can also check out our blogs here: www.edureka.co/blog
      Do subscribe to our channel to stay posted on upcoming tutorials. Cheers!

  • @vinulovesutube
    @vinulovesutube 6 years ago +2

    Before starting this session I had no clue about Big Data or Spark. Now I have a pretty decent insight. Thanks!

    • @edurekaIN
      @edurekaIN  6 years ago

      Thank you for watching our videos and appreciating our work. Do subscribe to our channel and stay connected with us. Cheers :)

  • @moview69
    @moview69 7 years ago +5

    you are undoubtedly the king of all instructors...you rock man

  • @ranjeetkumar2051
    @ranjeetkumar2051 2 years ago +1

    thank you sir for making this video

  • @sarthakverma5921
    @sarthakverma5921 3 years ago +2

    his teaching is pure gold

  • @leojames22
    @leojames22 6 years ago +1

    One of the best videos I have ever watched. MapReduce was not explained this way anywhere else I checked. Thank you for posting this. The use cases are really good, and worth the almost 2 hours of watching. 5 stars to the instructor. Very impressed.

  • @yitayewsolomon4906
    @yitayewsolomon4906 3 years ago

    Thanks very much. I'm a beginner in data science and I got a clear explanation of Spark. Thanks a lot!

    • @edurekaIN
      @edurekaIN  3 years ago

      Thank you so much for the review ,we appreciate your efforts : ) We are glad that you have enjoyed your learning experience with us .Thank You for being a part of our Edureka team : ) Do subscribe the channel for more updates : ) Hit the bell icon to never miss an update from our channel : )

  • @nileshdhamanekar4545
    @nileshdhamanekar4545 6 years ago +4

    Awesome session! Hats off to the instructor, you are amazing! The RDD explanation was the best

    • @edurekaIN
      @edurekaIN  6 years ago +1

      Hey Nilesh, we are delighted to know that you liked our video. Do subscribe to our channel and stay connected with us. Cheers :)

  • @hymavathikalva8959
    @hymavathikalva8959 2 years ago

    Very helpful session. Now I have some idea of Hadoop. Nice explanation, sir. Thank you.

    • @edurekaIN
      @edurekaIN  2 years ago

      Thank you so much : ) We are glad to be a part of your learning journey. Do subscribe the channel for more updates : ) Hit the bell icon to never miss an update from our channel : )

  • @mix-fz7ln
    @mix-fz7ln 1 year ago

    Awesome session! Hats off to the instructor.
    I was searching hard to understand Spark and nothing worked for me until this session explained it.
    Amazing! I love how the instructor clarifies and frames every concept.
    You are amazing!

    • @edurekaIN
      @edurekaIN  1 year ago

      Good to know our contents and videos are helping you learn better . We are glad to have you with us ! Please share your mail id to send the data sheets to help you learn better :) Do subscribe the channel for more updates 😊 Hit the bell icon to never miss an update from our channel

  • @rmuru
    @rmuru 7 years ago +26

    Excellent session...very informative..trainer is too good and explained all concepts in detail...thanks lot

  • @seenaiahpedipina1165
    @seenaiahpedipina1165 7 years ago +7

    Good explanation and useful tutorial. Conveyed a lot just in two hours. Thank you edureka !

    • @edurekaIN
      @edurekaIN  7 years ago

      Hey Srinu, thanks for the wonderful feedback! We're glad we could be of help.
      Here's another video that we thought you might like: ruclips.net/video/uD_q4Rm4i2Q/видео.html.
      Do subscribe to our channel to stay posted on upcoming tutorials. Cheers!

  • @arunasingh8617
    @arunasingh8617 2 years ago +1

    I have a question here: if we have almost 60M records, will creating an RDD while processing the data help in handling such huge data, or are some other processing steps required?

  • @taniakhan71
    @taniakhan71 6 years ago +2

    Thank you so much for this wonderful tutorial. I have a question: while discussing lazy evaluation, you mentioned that memory is allocated for RDDs B1 to B6, but they remain empty until collect is invoked. My question is: what is the size of the memory that is allocated for each RDD? How does the framework predict the size beforehand for each RDD without processing the data? E.g., B4, B5, B6 might have different sizes, smaller than or equal to B1, B2, B3 respectively. I didn't get this part. Could you please clarify?

    • @edurekaIN
      @edurekaIN  6 years ago

      What is the size of the memory that is allocated for each RDD?
      1. There is no easy way to estimate an RDD's size exactly; Spark's SizeEstimator utility uses approximate methods.
      2. By default, Spark uses 60% of the configured executor memory (--executor-memory) to cache RDDs. The remaining 40% of memory is available for any objects created during task execution. If your tasks slow down due to frequent garbage collection in the JVM, or if the JVM is running out of memory, lowering this value will help reduce memory consumption.
      How does the framework predict the size beforehand for each RDD without processing the data?
      1. You can determine how much memory is allocated to each RDD by looking at the SparkContext logs on the driver program.
      2. A recommended approach when using YARN would be --num-executors 30 --executor-cores 4 --executor-memory 24G. This would result in YARN allocating 30 executor containers, 5 containers per node, each using 4 executor cores. With roughly 124 GB of usable RAM per node, that gives 124/5 ≈ 24 GB per container.
      Hope this helps :)

  • @ajanthanmani1
    @ajanthanmani1 7 years ago +31

    1.5x for people who don't have 2 hours to watch this video :)

    • @edurekaIN
      @edurekaIN  7 years ago +12

      Whatever floats your boat, Ajanthan! :) Since we have learners from all backgrounds and requirements, we make our tutorials as detailed as possible.
      Thanks for checking out our tutorial. Do subscribe to stay posted on upcoming tutorials. We will be coming up with shorter tutorial formats too in the future. Cheers!

  • @shamla08
    @shamla08 7 years ago +4

    Very detailed presentation and a very good instructor! Thank you!

  • @ramsp35
    @ramsp35 5 years ago

    This is one of the best and most simplified Spark tutorials I have come across. 5 stars...!!!

    • @edurekaIN
      @edurekaIN  5 years ago

      Thank you for appreciating our efforts, Ramanathan. We strive to provide quality tutorials so that people can learn easily. Do subscribe, like and share to stay connected with us. Cheers!

  • @moneymaker2328
    @moneymaker2328 6 years ago +1

    Excellent session, no words to describe it... the trainer is too good... worth watching.

    • @edurekaIN
      @edurekaIN  6 years ago

      Hey Apurv, thank you for watching our video and appreciating our effort. Do subscribe and stay connected with us. Cheers :)

  • @krutikachauhan3299
    @krutikachauhan3299 3 years ago

    It was a totally new topic for me.. but still I was able to grasp it easily. Thanks to the whole team.

    • @edurekaIN
      @edurekaIN  3 years ago +1

      Hey:) Thank you so much for your sweet words :) Really means a lot ! Glad to know that our content/courses is making you learn better :) Our team is striving hard to give the best content. Keep learning with us -Team Edureka :) Don't forget to like the video and share it with maximum people:) Do subscribe the channel:)

  • @Successtalks2244
    @Successtalks2244 3 years ago

    I love these Edureka tutorials very much.

  • @suvradeepbanerjee6801
    @suvradeepbanerjee6801 1 year ago

    Great tutorial. Really cleared things up! Thanks a lot.

    • @edurekaIN
      @edurekaIN  1 year ago

      You're welcome 😊 Glad you liked it!! Keep learning with us..

  • @srividyaus
    @srividyaus 7 years ago

    This is the best Spark demo I have ever heard. A very clear and planned way of explaining things! I have taken Hadoop basics classes with Edureka, which are great! Planning to enroll for Spark as well. Would you explain more real-time use cases in the Spark training? The Hadoop basics course doesn't have use-case explanations, which is its only drawback! Great going, thanks a lot for this video.

    • @edurekaIN
      @edurekaIN  7 years ago

      +Srividyaus thanks for the thumbs up! :) We're glad you liked our tutorial and the learning experience with Edureka!
      We have communicated your feedback to our team and will work towards coming up with more real time use case videos on top of existing hands-on projects. Meanwhile, you might also find this video relevant: ruclips.net/video/zeDUx_Jf154/видео.html.
      Do subscribe to our channel to stay posted on upcoming videos and please feel free to reach out in case you need any assistance. Cheers!

  • @theabhishekkumardotcom
    @theabhishekkumardotcom 6 years ago +1

    Thank you for the quick introduction on the architecture of spark....

    • @edurekaIN
      @edurekaIN  6 years ago

      Hey Abhishek, Thank you for watching our video. Do subscribe, like and share to stay connected with us. Cheers :)

  • @gyanpattnaik520
    @gyanpattnaik520 7 years ago +4

    It's an amazing video. It gives a complete picture of Spark as well as its implementation in the real world. Thanks!

  • @JanacMeena
    @JanacMeena 5 years ago +2

    Jump to 21:25 for the example

  • @sahanashenoy5895
    @sahanashenoy5895 4 years ago

    Amazing way of explanation, crystal clear. Way to go, Edureka!

  • @kadhirn4792
    @kadhirn4792 4 years ago

    Great Video. He is my tutor for ML

  • @pritishkumar6514
    @pritishkumar6514 5 years ago

    Loved the way the trainer explained it. Watched it for the first time and it cleared all my doubts. Thanks, Edureka.

    • @edurekaIN
      @edurekaIN  5 years ago

      Thanks for the compliment Pritish! we are glad you loved the video. Do subscribe to the channel and hit the bell icon to never miss an update from us in the future. Cheers!

  • @laxmipriyapradhan8087
    @laxmipriyapradhan8087 2 years ago

    Thank you sir, I just love your teaching style. Are there any other videos of yours on YouTube? Please share the link.

    • @edurekaIN
      @edurekaIN  2 years ago

      Hi Laxmipriya, glad to hear this from you. Please feel free to visit our channel for more informative videos, and don't forget to subscribe to get notified of our new videos.

  • @nshettys
    @nshettys 4 years ago

    Brilliant Explanation!!! Thank you

  • @sagnikmukherjee5108
    @sagnikmukherjee5108 4 years ago

    It's an awesome session. The way you explain everything with examples is remarkable. Thanks, mate.

    • @edurekaIN
      @edurekaIN  4 years ago

      Thanks for the wonderful feedback! We are glad we could help. Do subscribe to our channel to stay posted on upcoming tutorials.

  • @AdalarasanSachithanantham
    @AdalarasanSachithanantham 2 months ago

    Superb 🎉

  • @areejabdelaal4446
    @areejabdelaal4446 4 years ago +1

    thanks a lot!

  • @praveenmail2him
    @praveenmail2him 7 years ago +2

    No words to say: Spark made simple even for a layman.

  • @ashishpaul8557
    @ashishpaul8557 4 years ago

    Thank You @ Edureka for doing such excellent work.

  • @niveditha-7555
    @niveditha-7555 3 years ago

    Wow!! extremely impressed with this explanation

  • @coolprashantmailbox
    @coolprashantmailbox 7 years ago +1

    Very useful video for beginners. Awesome, thank you!

  • @iiitsrikanth
    @iiitsrikanth 7 years ago +3

    Good work Edureka Team! Really Helpful to the beginners.

  • @tsuyoshikittaka6636
    @tsuyoshikittaka6636 7 years ago +4

    wonderful tutorial ! thank you :)

  • @gaurisharma9039
    @gaurisharma9039 6 years ago

    Kindly put a video on spark pipelining. I would really appreciate that. Thanks much in advance

  • @dhruveshshah1872
    @dhruveshshah1872 7 years ago +1

    Loved your video. Explained the basic details in a best possible way. Would wait for your new videos on this topic..Can you share the github link for the earthquake project?

    • @edurekaIN
      @edurekaIN  7 years ago

      Hey Dhruvesh, thanks for checking out our tutorial. We're glad you liked it.
      Please check out this blog for the code: www.edureka.co/blog/spark-tutorial/
      You can fill in your request on the google form in the blog. Hope this helps.
      Do subscribe to our channel to stay posted on upcoming tutorials. Cheers!

  • @IsabellaYuZhou
    @IsabellaYuZhou 5 years ago +1

    1:01:07

  • @shubhamshingi9618
    @shubhamshingi9618 4 years ago +1

    Wow, such amazing content. Thanks, Edureka, for this!

  • @SkandanKA
    @SkandanKA 4 years ago

    Nice, brief explanation @edureka.
    Keep going with more such good tutorials.. 👍

  • @manishdev71
    @manishdev71 5 years ago +4

    Excellent session.

  • @ajiasahamed8814
    @ajiasahamed8814 6 years ago

    Excellent session. The trainer is fantastic, with a great attitude. Edureka, you are amazing at online coaching.

  • @umeshsawant135
    @umeshsawant135 6 years ago +2

    Excellent session!! The trainer is well experienced and a good teacher as well. All the best, Edureka!

    • @edurekaIN
      @edurekaIN  6 years ago

      Thank you for watching our video. Do subscribe, like and share to stay connected with us. Cheers :)

  • @puneethunplugged
    @puneethunplugged 7 years ago +3

    Thank you for the crisp session. Good content and flow. Appreciate it.

  • @tabitha3302
    @tabitha3302 7 years ago +1

    Excellent video, super explanation. We would like more real-time examples and use cases. Worth it, awesome!

    • @edurekaIN
      @edurekaIN  7 years ago

      Hey Tabitha, thanks for the wonderful feedback! We're glad you found it useful.
      Do follow our channel to stay posted on upcoming tutorials.
      You can also check out our complete training here: www.edureka.co/apache-spark-scala-training.
      Hope this helps. Cheers!

  • @ManishKumar-ni4pi
    @ManishKumar-ni4pi 5 years ago

    The way of representation is wonderful. Thank you

    • @edurekaIN
      @edurekaIN  5 years ago

      Thanks for the compliment Manish! We are glad you loved the video. Do subscribe to the channel and hit the bell icon to never miss an update from us in the future. Cheers!

  • @chandan02srivastav
    @chandan02srivastav 5 years ago +3

    Very well explained!! Amazing Tutor

  • @sunithachalla7840
    @sunithachalla7840 3 years ago

    awesome session...

  • @nagendrag2441
    @nagendrag2441 5 years ago

    The explanation is very good. Thank you! Now I understand the overview completely.

  • @umashankarsaragadam8205
    @umashankarsaragadam8205 7 years ago +1

    Excellent explanation ...Thank you

  • @hemanthgowda5855
    @hemanthgowda5855 6 years ago

    Good lecture. An action is a trigger for lazy eval to start right? .collect() is not equivalent to printing..

    • @edurekaIN
      @edurekaIN  6 years ago

      Hey Hemanth, sorry for the delay. Yes, action is a trigger for lazy evaluations to start. To print all elements on the driver, one can use the collect() method to first bring the RDD to the driver node.
      Hope this helps!
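
The lazy-evaluation point in this reply can be mimicked in plain Python with generators. This is only an analogy (hypothetical code, not the Spark API): the "transformations" merely build a recipe, and no data is touched until an "action" forces evaluation, just as collect() does in Spark.

```python
# Plain-Python analogy of Spark's lazy evaluation (not the real Spark API).
log = []

def numbers():
    for n in range(1, 7):          # pretend these are records in an RDD
        log.append(f"read {n}")    # side effect so we can see *when* work happens
        yield n

# "Transformations": nothing is read yet; we only compose generators.
doubled = (n * 2 for n in numbers())
small   = (n for n in doubled if n < 10)

assert log == []                   # no work has happened so far (lazy)

# "Action": list() plays the role of collect() and forces evaluation.
result = list(small)
assert result == [2, 4, 6, 8]
assert len(log) == 6               # only now were the records actually read
```

The same pattern explains why memory for B1..B6 can be "allocated" as empty placeholders: until the action runs, each RDD is just a recipe, not materialized data.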

  • @taniakhan71
    @taniakhan71 6 years ago +1

    Thank you for the explanation.

    • @edurekaIN
      @edurekaIN  6 years ago

      Thank you for watching our video. Do subscribe, like and share to stay connected with us. Cheers :)

  • @Yashyennam
    @Yashyennam 4 years ago

    This is top notch 👍👍👌

  • @muhammadrizwanali907
    @muhammadrizwanali907 6 years ago +1

    Excellent video from the tutor. Very well defined the concepts and technology. Really appreciable

    • @edurekaIN
      @edurekaIN  6 years ago

      Thank you for watching our video. Do subscribe, like and share to stay connected with us. Cheers :)

  • @1203santhu
    @1203santhu 7 years ago +1

    Session is really fantastic and informative...

    • @edurekaIN
      @edurekaIN  7 years ago

      Hey Santhosh, thanks for checking out our tutorial! We're glad you found it useful. :) Here's another video that we thought you might like: ruclips.net/video/uD_q4Rm4i2Q/видео.html
      Do subscribe to our channel to stay posted on upcoming videos. Cheers!

  • @nikitagupta6174
    @nikitagupta6174 7 years ago +1

    Hi, I have a few questions:
    1.) About the difference between Hadoop and Spark: you said there are a lot of I/O operations in Hadoop, whereas in Spark I/O happens only once, when the blocks are copied into memory, and the rest of the operations are performed in memory itself. So I wanted to ask: when the entire operation is completed, is an I/O operation required again to copy the result to disk, or does the result stay in memory in the case of Spark?
    2.) Also, when we use map and reduce functions in Spark with Python, how do those work? Are all the map operations done in memory, like in Hadoop? And what about reduce? Since reduce merges the results of two blocks, won't network overhead occur again when we pass data from another disk to the disk where the reduce operation is done, and won't that disk again copy the data into its memory? Can you explain how exactly this works in the case of Spark?

    • @edurekaIN
      @edurekaIN  7 years ago +3

      Hey Nikita, thanks for checking out our tutorial! Here are the answers to your questions:
      1. Spark doesn't work in a strict map-reduce manner, and map output is not written to disk unless it is necessary; only shuffle files are written to disk.
      That doesn't mean data after the shuffle is not kept in memory. Shuffle files in Spark are written mostly to avoid re-computation in case of multiple downstream actions. The difference between Spark storing data locally (on executors) and Hadoop MapReduce is that:
      i. The partial results (after computing ShuffleMapStages) are saved on local hard drives, not on HDFS, which is a distributed file system where saves are very expensive.
      ii. Only some files are saved to the local hard drive (after operations are pipelined), which does not happen in Hadoop MapReduce, which saves all map outputs to HDFS.
      2. When we use map and reduce functions in Spark with Python:
      The Spark Python API (PySpark) exposes the Spark programming model to Python (see the Spark Programming Guide).
      PySpark is built on top of Spark's Java API.
      Data is processed in Python and cached/shuffled in the JVM.
      3. Are all the map operations done in memory?
      Yes, all the operations will be done in memory, and the reduce operations will work the same way, because data is processed in Python and cached/shuffled in the JVM.
      Hope this helps. Cheers!
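
The shuffle behaviour described in this reply can be illustrated with a toy pure-Python simulation (hypothetical code, not Spark itself): the map side emits key-value pairs per partition entirely in memory, the shuffle routes each key to a reducer bucket by hash (the same idea Spark uses to decide which shuffle file a record lands in), and the reduce side sums per key.

```python
# Toy simulation of map -> shuffle -> reduce for a word count
# (pure Python; illustrates the data movement discussed above, not real Spark).
from collections import defaultdict

def map_phase(partition):
    # Map side: runs per partition, in memory, emitting (word, 1) pairs.
    return [(word, 1) for line in partition for word in line.split()]

def shuffle_and_reduce(mapped_partitions, n_reducers=2):
    # Shuffle: each key is routed to one reducer bucket by hash of the key,
    # so all pairs for the same word end up on the same "reducer".
    buckets = [defaultdict(int) for _ in range(n_reducers)]
    for pairs in mapped_partitions:
        for word, count in pairs:
            buckets[hash(word) % n_reducers][word] += count  # reduce: sum per key
    return buckets

partitions = [["spark keeps data in memory", "spark is fast"],
              ["hadoop writes intermediate data to disk"]]
mapped = [map_phase(p) for p in partitions]
counts = {w: c for bucket in shuffle_and_reduce(mapped) for w, c in bucket.items()}
print(counts["spark"], counts["data"])  # 2 2
```

In real Spark the bucketed map output becomes shuffle files on the executors' local disks, and the in-memory reduce step here corresponds to the downstream stage reading those files.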

  • @rahulmishra4111
    @rahulmishra4111 6 years ago

    Great session, very informative. Can you please share the sequence of videos in the Apache Spark and Scala learning playlist? Thanks in advance.

  • @MrAK92
    @MrAK92 6 years ago +1

    Awesome class. Thank you, sir, for providing very useful information.

    • @edurekaIN
      @edurekaIN  6 years ago

      Hey Arun! Thank you for the wonderful feedback. Do subscribe to our channel and check out our website to know more about Apache Spark training: www.edureka.co/apache-spark-scala-training
      Hope this helps. Thanks :)

  • @SaimanoharBoidapu
    @SaimanoharBoidapu 6 years ago +2

    Very well explained. Thank you :)

  • @foradvait7591
    @foradvait7591 4 years ago

    Excellent. Dear trainer sir, you have an amazing hold on Spark concepts. Regards.

  • @girishahb01
    @girishahb01 7 years ago

    Nicely explained. I am in the process of learning machine learning algorithms in Python & R. I may have to learn Spark in the future :)

    • @edurekaIN
      @edurekaIN  7 years ago

      Hey Girisha, thanks for checking out our tutorial. We're glad you found it useful.
      Please feel free to check out our Spark course here: www.edureka.co/apache-spark-scala-training. You can get in touch with us anytime if you need any information or assistance. Hope this helps. Cheers!

  • @joa1paulo_
    @joa1paulo_ 4 years ago

    Thanks for sharing!

  • @girish90
    @girish90 3 years ago

    Excellent session!

  • @kavyaa1053
    @kavyaa1053 6 years ago +1

    Thanks for this video.

    • @edurekaIN
      @edurekaIN  6 years ago

      Hey Kavya, thank you for appreciating our work. Do subscribe and stay connected with us. Cheers :)

  • @ankitas7293
    @ankitas7293 5 years ago

    This is Shivank sir's voice... he is a very, very good trainer.

  • @bobslave7063
    @bobslave7063 6 years ago +2

    Thanks, for amazing tutorials! Very well explained.

  • @2007selvam
    @2007selvam 7 years ago +1

    It is a very useful session.

    • @edurekaIN
      @edurekaIN  7 years ago

      +Rangasamy Selvam, thanks for checking out our tutorial! We're glad you found it useful. Here's another video that we thought you might like: ruclips.net/video/xNAD6cBKyaA/видео.html.
      Do subscribe to our channel to stay posted on upcoming tutorials. Cheers!

  • @manedinesh
    @manedinesh 6 years ago +1

    Nicely explained. Thanks!

    • @edurekaIN
      @edurekaIN  6 years ago

      Hey Dinesh, thank you for watching our video. Do subscribe, like and share to stay connected with us. Cheers :)

  • @nihanthsreyansh2480
    @nihanthsreyansh2480 7 years ago

    Cheers to Edureka! Very well explained. Please upload "Using Python with Apache Spark" videos too!!

    • @edurekaIN
      @edurekaIN  7 years ago +1

      Hey Nihanth, thanks for checking out our tutorial. We're glad you liked it.
      We do not have such a tutorial at the moment, but we have communicated your request to our team and we might come up with it in the future. Do subscribe to our channel to stay posted. Cheers!

    • @nihanthsreyansh2480
      @nihanthsreyansh2480 7 years ago +1

      Thanks for the reply !

  • @deepikapatra1065
    @deepikapatra1065 4 years ago +1

    Amazing video! So many concepts got cleared up in just 2 hours :) Keep up the good work, Edureka!

  • @efgh7906
    @efgh7906 7 years ago

    Great explanation and a great session.

  • @deankommu3137
    @deankommu3137 5 years ago

    nice video with a brief explanation

  • @safiaghani4078
    @safiaghani4078 6 years ago

    Hi, this is a very informative lecture. I plan to write my thesis on Apache Spark. Could you please suggest a good topic? It would be a great help, thanks.

    • @edurekaIN
      @edurekaIN  6 years ago

      Hey! You can refer to this thread on quora: www.quora.com/I-want-to-do-my-thesis-in-Apache-Spark-What-are-a-few-topics-or-areas-for-that
      Hope this helps. Cheers :)

  • @theinsanify7802
    @theinsanify7802 5 years ago

    Thank you very much, this was an amazing course.

    • @edurekaIN
      @edurekaIN  5 years ago +1

      Thanks for the compliment, Mahdi! We are glad you loved the video. Do subscribe to the channel and hit the bell icon to never miss an update from us in the future. Cheers!

    • @theinsanify7802
      @theinsanify7802 5 years ago

      @@edurekaIN i sure did .. can't miss these contents.

  • @JohnWick-zc5li
    @JohnWick-zc5li 6 years ago +2

    Good job, guys, thanks.

    • @edurekaIN
      @edurekaIN  6 years ago

      Thank you for watching our video. Do subscribe, like and share to stay connected with us. Cheers :)

  • @u1l2t3r4a55
    @u1l2t3r4a55 6 years ago +1

    Good session!

  • @prabhathkota107
    @prabhathkota107 5 years ago +1

    The overview of Spark was very well explained.

  • @pradeepp2009
    @pradeepp2009 6 years ago +1

    Hi all, I have a doubt: I have 1 PB of data to be processed in Spark. If I try to read it, will the whole 1 PB be stored in memory or not? How will it be processed? Could anyone please help me?

  • @darisanarasimhareddy4311
    @darisanarasimhareddy4311 7 years ago

    I completed Hadoop coaching a few days back. I would like to learn Spark and Scala. Are these 39 videos good enough for Spark and Scala training?

    • @edurekaIN
      @edurekaIN  7 years ago

      +Darisa NarasimhaReddy, thanks for choosing Edureka to learn Hadoop.
      About your query, these tutorials will give you a basic introduction to Spark but you will miss out on the hands-on components, assignments & doubt clarification since these are pre-recorded sessions. We'd suggest that you take up our Spark course as the next step in your learning path since Hadoop + Spark will give you tremendous career growth.
      Would you like us to get in touch with you and assist you with your queries?
      Hope this helps. Cheers!

  • @sasidharasandcube6397
    @sasidharasandcube6397 6 years ago +1

    Good explanation and informative.

    • @edurekaIN
      @edurekaIN  6 years ago

      Thank you for appreciating our work. Do subscribe, like and share to stay connected with us. Cheers :)

  • @arpit006
    @arpit006 6 years ago +1

    awesome learning

    • @edurekaIN
      @edurekaIN  6 years ago

      Thank you for watching our video. Do subscribe, like and share to stay connected with us. Cheers :)

  • @atindrabhattacharya2263
    @atindrabhattacharya2263 6 years ago +1

    I love shivank, He is awesome. thanks for this wonderful session

    • @edurekaIN
      @edurekaIN  6 years ago

      Hey Atindra, thank you for appreciating our trainers. We are glad that you found the videos helpful Do subscribe and stay connected with us. Cheers :)

  • @JarinTasnimAva
    @JarinTasnimAva 6 years ago +1

    Very well described! Amazing!

    • @edurekaIN
      @edurekaIN  6 years ago

      Thank you for watching our video. Do subscribe, like and share to stay connected with us. Cheers :)

  • @sagarsinghrajpoot3832
    @sagarsinghrajpoot3832 5 years ago

    Awesome video sir 🙂

  • @sasikumar-gp9zd
    @sasikumar-gp9zd 7 years ago

    Hi, useful information. What are the prerequisites to learn Apache Spark and Scala? Is it useful for a fresher to do this course?

    • @edurekaIN
      @edurekaIN  7 years ago

      +Sasi Kumar, thanks for checking out our tutorial! To learn Spark, a basic understanding of functional programming and object-oriented programming will come in handy. Knowledge of Scala will definitely be a plus, but it is not mandatory.
      Spark is normally taken up by professionals with some knowledge of Hadoop. You could either up-skill with Hadoop and then follow the learning path to Apache Spark and Scala or you can directly take up Spark training.
      Hadoop basics will be touched upon in our Spark training also.
      You can find out more about our Hadoop training here: www.edureka.co/big-data-and-hadoop and learn more about our Spark training here: www.edureka.co/apache-spark-scala-training.
      Hope this helps. Cheers!

  • @003vipul
    @003vipul 6 years ago +1

    Very useful Post.

    • @edurekaIN
      @edurekaIN  6 years ago

      Thank you for watching our video. Do subscribe, like and share to stay connected with us. Cheers :)

  • @balamuruganp2694
    @balamuruganp2694 6 years ago +1

    I don't know anything about the Hadoop system. Can you give me some information about it as well?

    • @edurekaIN
      @edurekaIN  6 years ago

      Hey Balamurugan, you will find this video helpful, do give it a look: ruclips.net/video/m9v9lky3zcE/видео.html
      Hope this helps :)

  • @rajashekarpantangi9673
    @rajashekarpantangi9673 7 years ago

    Very good Explanation. Awesome content.
    I have a question.
    When the Map function is executed, the results are given as a block in memory. This is fine. In the example provided in the video, the map function doesn't require any further computation (since the job is to take numbers less than 10). What about a job like word count?
    1. What would the output of the map function be?
    Is it the same as the Map function in MapReduce: (apple,1), (apple,1), (apple,1), (banana,1), (banana,1), (banana,1), (orange,1), (orange,1), (orange,1)?
    Or can we write the code for reducing in the same map function as well, giving output as (apple,3), (orange,3), (banana,3)?
    2. And will the blocks from each data node be sent to a single data node to execute the further computation (as in reduce in MapReduce)?
    Thanks in advance

    • @edurekaIN
      @edurekaIN  7 years ago

      Hey Rajashekar, thanks for the wonderful feedback! We're glad you liked our tutorial.
      This error (Unsupported major.minor version) generally appears when a higher JDK is used at compile time and a lower JDK at runtime.
      Your default Java version and Hadoop's Java version should match. To see your current Java version, type java -version in a terminal. To find the Java version used by Hadoop, open the hadoop-env.sh file (in the etc folder), which contains an entry for JAVA_HOME like "export JAVA_HOME = /usr/lib/jvm/jdk1.7.0_67" or something like that. If the versions shown by the two differ, this error arises. Set JAVA_HOME to the path of the JDK matching the version shown by java -version.
      Hope this helps. Cheers!

    • @rajashekarpantangi9673
      @rajashekarpantangi9673 7 years ago

      I don't think you answered my question. Please read my question again and reply.
      Thanks.

    • @edurekaIN
      @edurekaIN  7 years ago +1

      Hey Rajashekar, here's the explanation:
      1. Word count in Spark: the map function is similar to Hadoop MapReduce's map, but not the same.
      map(func): returns a new distributed dataset formed by passing each element of the source through the function func.
      Consider the word count code, in Scala:
      val ip = sc.textFile("file:///home/edureka/Desktop/example.txt") // load the sample example file
      val wordCounts = ip.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey((a, b) => a + b) // flatMap splits each line on the space delimiter; map assigns each word a value of 1; reduceByKey adds up the values having the same key, i.e. the same word
      wordCounts.collect // this gives output as below:
      res: Array[(String, Int)] = Array((banana,2), (orange,6), (apple,4))
      2. Spark does in-memory processing, so only the needed data is pushed to memory and processed. In the example, flatMap, map and reduceByKey are transformation functions, which are lazily evaluated: data is not pushed to memory/RAM immediately (the transformations build a lineage graph of RDDs). Whenever an action (collect in the code above) runs on the final RDD, Spark uses the lineage details and pushes the required data to memory.
      Spark does not work like Hadoop: blocks are not sent to a single node for processing. Instead, computation happens in the memory of each node where the needed data exists, and the aggregated result is sent to the Spark master node / client. This is why Spark is faster: there are no disk I/O operations as in Hadoop.
      Hope this helps. Cheers!
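
      [Editor's note] The flatMap / map / reduceByKey flow described above can be mimicked in plain Python with no Spark cluster, just to make the data movement visible. This is a local, eager analogue, not Spark itself, and the input lines are made up for illustration:

```python
from collections import defaultdict

# Made-up input standing in for the lines of example.txt
lines = ["apple banana apple", "orange banana orange"]

# flatMap: split each line on spaces and flatten into one list of words
words = [word for line in lines for word in line.split(" ")]

# map: pair each word with the value 1
pairs = [(word, 1) for word in words]

# reduceByKey: add up the values that share the same key (the word)
counts = defaultdict(int)
for word, one in pairs:
    counts[word] += one

print(dict(counts))  # {'apple': 2, 'banana': 2, 'orange': 2}
```

      Unlike Spark, this runs eagerly on one machine; in Spark the three steps only build a lazy lineage, and nothing executes until an action such as collect is called.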

    • @rajashekarpantangi9673
      @rajashekarpantangi9673 7 years ago

      Thanks!!

  • @data5508
    @data5508 6 years ago +1

    Brilliantly explained (Y)

  • @ainunabdullah2140
    @ainunabdullah2140 6 years ago

    Very good Tutorial

    • @edurekaIN
      @edurekaIN  6 years ago

      Hey Abdullah, thanks for the wonderful feedback! We're glad we could be of help. You can check out our complete Apache Spark course here: www.edureka.co/apache-spark-scala-training.
      Do subscribe to our channel to stay posted on upcoming tutorials. Hope this helps. Cheers!

  • @gaurisharma9039
    @gaurisharma9039 6 years ago

    Thank you for guiding students like us sir. Appreciate your knowledge and ability to pass it to us. It was a great session.

    • @edurekaIN
      @edurekaIN  6 years ago

      Hey Gauri, glad you loved the session. Do subscribe to our channel and hit the bell icon to never miss an update from us in the future. Cheers!