What's your opinion of Spark? Do you think it's the future of data?
Realtime data: ruclips.net/video/wEueOJ4OSlg/видео.html
Big Data Pipelines: ruclips.net/video/hKv70zftW-Y/видео.html
In real-time data streaming, is Spark a speed layer?
We couldn't get it to perform. It took days to process a job. After 4 months of trying, we rewrote it in plain Java, and it now takes about an hour to crunch the data instead of days. Furthermore, instead of 192 CPUs with 1.5 TB of RAM, we now use a minimal 2 GB JVM, saving tens of thousands per month. That's my experience. As with everything else, it depends on the application and the team's skills.
This was the best explanation of Apache Spark architecture that I found on YT. Thank you.
Nicely organised and to the point.
I wish my whole academics was like this.
Excellent video and information. Thank you for explaining this for a broader audience.
Wow! Very insightful, short, informative video. Exactly how I like it
Glad you liked it!
Null queries uploads a video. Time to leave everything as is, and hit like before I watch.
Thanks for the great explanation. I was so overwhelmed by the many concepts in the IBM course on Coursera, and suddenly those concepts make sense.
Nice introduction/presentation, concise, to the point. Thanks!
Which video editing software do you use? The quality is insane!
Thanks! I use the Adobe products (Illustrator for objects, After Effects for animation, Audition for audio).
Love the content and info delivery! Keep up the great work 👍
That is a book's worth of material summarized to the point. Many thanks.
Thank you so much for this video. The explanation was very easy to understand!
Thank you for this video.
Excellent content.
Thank you, Ms. Navarro! The video helped a lot!
Very interesting video! It gave Spark capabilities in a nutshell :)
Thanks
Glad you liked it!
Thank you so much, it really helps me get the idea behind it.
Glad it helped!
Excellent vid! I learnt a lot
Good video as usual
Thanks!
Tremendously amazing💖
Excellent content!
Great video! thanks
This is the video I was looking for: short but in-depth knowledge of Spark. I was wondering whether I should learn it or not.
Useful. Thank you.
Great video, thank you!
Thank you, sir, for making this video.
OOO Best people
But what is it?!?! Is it a programming language like Python? Is it a type of hardware like a GPU? Is it a database client like MySQL Workbench? Is it an algorithm like PageRank? Is it a computer program like Excel? "Fast general purpose framework for data processing" tells me absolutely nothing. That could be talking about a library like TensorFlow, an organisational strategy, a set of design principles, literally anything, man.
It's a framework (a fast, general-purpose one for data), in the same way that .NET and Ruby on Rails are frameworks. TensorFlow is also a framework, for AI and ML, written largely in C++, but most people know it through the Python library that lets Python use it.
It's a JVM application written in Scala. The Executors mentioned in the video each run a JVM to do the actual work (after it has been assigned by the Driver, which also runs inside a JVM). So essentially Spark is a distributed JVM application.
In addition, you have interfaces in Python (the pyspark library) and R (the SparkR and sparklyr libraries). The SQL interface mentioned in the video is actually a higher-level abstraction built (again in Scala) on top of Spark Core. The hierarchy of abstractions in Spark looks like this; the first three are written natively in Scala and make up what is called Spark: Spark Core -> Spark SQL -> Spark DataFrame (-> access via e.g. pyspark)
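If it helps to see the Driver/Executor split as code, here is a toy sketch in plain Python. To be clear, this is not Spark's API and not how Spark is implemented: the function names (`split_into_partitions`, `executor_task`, `driver`) are made up for illustration, and threads on one machine stand in for executor JVMs on a cluster. It only shows the general shape: the driver cuts the data into partitions, hands each one to a worker, and combines the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(data, num_partitions):
    """Driver side (hypothetical helper): cut the dataset into chunks."""
    return [data[i::num_partitions] for i in range(num_partitions)]


def executor_task(partition):
    """Executor side (hypothetical helper): work on one partition only.

    The "job" here is a sum of squares, chosen only because it is easy
    to split up and recombine.
    """
    return sum(x * x for x in partition)


def driver(data, num_executors=4):
    """Assign one partition per worker, then combine the partial results."""
    partitions = split_into_partitions(data, num_executors)
    # Threads stand in for executors; real Spark executors are separate
    # JVM processes, usually on different machines.
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        partial_results = list(pool.map(executor_task, partitions))
    return sum(partial_results)


if __name__ == "__main__":
    print(driver(list(range(10))))
```

The real thing adds scheduling, fault tolerance, shuffles between machines and so on, which is where most of the complexity (and the tuning pain mentioned elsewhere in the comments) comes from.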
It's a computing engine: it performs computations and provides libraries for parallel processing.
RDD