Master Databricks and Apache Spark Step by Step: Lesson 1 - Introduction

  • Published: Jan 30, 2025
  • Science

Comments • 107

  • @faisala1037
    @faisala1037 3 years ago +110

    There are a ton of videos (one for every keyword) on YouTube on this subject. Most fail to deliver any useful knowledge; others are too narrow and/or incomprehensible. I'm so glad to have found this series. Your teaching style took me back to my college classes. Fairly detailed and well explained. So a big thanks to you for it, Bryan 👍.

    • @BryanCafferky
      @BryanCafferky 3 years ago +5

      Thanks, Faisal. If you follow the entire series, you will get a solid foundation.

    • @mansah707
      @mansah707 6 months ago

      @BryanCafferky I intend to go through this whole thing. Lessons 0 and 1 completed... on to Lesson 2.

  • @dhwanik02
    @dhwanik02 1 year ago +17

    This is one of the best and clearest explanations about Spark and Databricks on the internet.

  • @bhaskarsoni1468
    @bhaskarsoni1468 8 days ago

    The 1st class where I did not fall asleep :). There is something special about how you explain and keep listeners hooked. I am eager to go through the next sessions.

    • @BryanCafferky
      @BryanCafferky 7 days ago

      Thanks! I appreciate the kind words and am glad the videos are helpful.

  • @MarkFreedmanNY
    @MarkFreedmanNY 1 year ago +3

    Finally, a Databricks YouTube series that makes sense! I'm using DB with AWS, but this all pertains. Thanks!

  • @boubeniamohamed236
    @boubeniamohamed236 1 year ago +2

    Definitely the best series for learning Databricks.

  • @mandarkulkarni7675
    @mandarkulkarni7675 1 year ago +2

    Probably the first video that describes the difference between Spark and Databricks so cleanly, and also where the different components of Spark sit in the whole data engineering ecosystem... Thanks a lot!!!

  • @AnandGhosh-m4d
    @AnandGhosh-m4d 10 months ago +1

    Spot on! I really liked how you transitioned through the broader umbrella of Hadoop > Spark > Databricks. Great job, Bryan!

  • @bibinkunjumon
    @bibinkunjumon 1 year ago +1

    This is my 3rd teacher. You explained it all well, as an experienced person would. At first I wondered what this old man was going to say... now I end up touching your feet. Well done.
    Bibin from Kerala, India

  • @kingawewome3900
    @kingawewome3900 1 month ago

    Love the accent. As a New Englander living abroad, it made me homesick! This intro video is wicked awesome.

  • @nathanielsackey2083
    @nathanielsackey2083 3 months ago +1

    Man, man, I'm supporting this guy on Patreon... What a class... what a breakdown... You hear about all these tools, and 9 out of 10 times I'm drowning in them.

  • @arturrizzato1034
    @arturrizzato1034 9 months ago +1

    A very good class, especially for a Databricks virgin like me.

  • @abhinavkashyapv
    @abhinavkashyapv 2 years ago +7

    This video clearly explains the concepts around Apache Spark, Databricks, and the various offerings. Wonderful explanation, thanks a ton 👏👍

  • @rmj5410
    @rmj5410 4 months ago

    Absolutely the best explanation of Databricks I've ever heard

  • @sudhamadhurikandru2708
    @sudhamadhurikandru2708 3 months ago

    I don't know how I did not see your channel earlier. I am now hooked on these; please make more and more. I like listening to your tutorials and making notes.

  • @anuraratnasiri5516
    @anuraratnasiri5516 14 days ago

    Wonderful presentation. Thank you so much, Bryan!🙏

  • @naomilago
    @naomilago 2 years ago +6

    O M G
    I found what I was looking for.
    I've started working at Nestlé as a Data Science Analyst and was searching for a good Databricks and Spark playlist to get a deeper understanding of this subject, and you're the one that matched the way I learn and follow lectures. A huge thanks to you 🌟

    • @BryanCafferky
      @BryanCafferky 2 years ago +1

      Thanks so much! It is really great to hear feedback like that! Glad it helps you.

  • @CodeVeda
    @CodeVeda 29 days ago

    finally someone is talking clearly

  • @samirks27
    @samirks27 6 months ago

    Thanks, Bryan, for the wonderful video. You kept me engaged and attentive throughout the video. Your explanation is crystal clear and one of the best on the internet. Thanks, and God bless you with health and energy.

  • @mansah707
    @mansah707 6 months ago

    I have never seen such a straightforward, clear, concise explanation of this concept. To date, I have tried to understand Apache Spark and Databricks, but I've always had a convoluted understanding of them. Thank you so much for this video. It really helped me understand where things stand now.

    • @BryanCafferky
      @BryanCafferky 6 months ago +1

      Thanks. Glad the videos are helpful.

  • @Hamromerochannel
    @Hamromerochannel 7 months ago

    I tried Databricks Academy and I got lost. Thanks to this channel, I understand every nook and cranny. Thumbs up, Bryan!!

    • @BryanCafferky
      @BryanCafferky 7 months ago

      Thank you! Glad my videos are helping you.

  • @amataratsu006-xs6hv
    @amataratsu006-xs6hv 10 months ago

    Sir thank you so much! You match my learning style and you have a clear voice

    • @BryanCafferky
      @BryanCafferky 10 months ago

      Thanks. Glad the videos are helpful!

  • @ash2ucool
    @ash2ucool 1 year ago

    Thank you, thank you, thank you for explaining it in the simplest way possible. At last I was able to understand what Hadoop, Spark, and Databricks are, and what they actually do.

    • @BryanCafferky
      @BryanCafferky 1 year ago

      So glad to hear that. It's why I do this channel. Thanks

  • @dataoil8416
    @dataoil8416 2 years ago

    Exactly what I was looking for !!! your best teacher is your last mistake! proved!

  • @alexandermedina4950
    @alexandermedina4950 2 years ago +1

    Great content, thank you for doing this general and historic view, sometimes it is necessary to understand the details.

  • @voliteon
    @voliteon 1 year ago

    Thanks for your videos Bryan - nice work. Really good amount of information clearly explained.

  • @danielejiofor7032
    @danielejiofor7032 5 months ago +1

    Best DB tutorial out there!!!

  • @animeshmohanty5052
    @animeshmohanty5052 2 years ago

    You are awesome! There's hardly any other material which is as clear and condensed. Thank you for creating this video🙏

  • @Mickley0
    @Mickley0 2 months ago

    Fantastic video, thank you Bryan

  • @alokhom
    @alokhom 7 months ago

    Your video has decluttered things for me a lot. Now I am going to set up HDFS and the Spark operator on my k8s cluster.

  • @hemalpbhatt
    @hemalpbhatt 5 months ago

    Love your explanation! It is so easy to understand

  • @KhalilJolibois
    @KhalilJolibois 3 years ago +1

    Thanks for these videos. I'm finishing up the DataCamp data engineer track and then jumping in on these.

  • @samanthamccarthy9765
    @samanthamccarthy9765 1 year ago

    Thanks, really good summary of all these languages and how they came about.

  • @datoalavista581
    @datoalavista581 2 years ago +2

    Thank you Professor Bryan !

  • @brenthackers132
    @brenthackers132 1 year ago

    Guy has two left sides and still manages to make sense. Inspiring. :)

  • @stefantodorovikj6165
    @stefantodorovikj6165 4 months ago

    Thank you brother you are simply amazing

  • @andreaceribelli9705
    @andreaceribelli9705 7 months ago

    Incredible quality, thanks!

  • @anandchandrashekhar2933
    @anandchandrashekhar2933 2 years ago

    Great start to the series. Thank you!

  • @MeridiusMaximus
    @MeridiusMaximus 2 years ago +2

    such a clean explanation. Thank you!

  • @ishaqkhan8653
    @ishaqkhan8653 8 months ago +1

    Hey Bryan, thank you for the excellent video. It put my mind at ease. I have seen that you use Azure Databricks going forward. However, my organization stores data on S3 and works predominantly in the Databricks platform itself. I was wondering if the knowledge you have shared will work well directly in the Databricks platform. I am a complete beginner in this field, so apologies for any silly questions.

    • @BryanCafferky
      @BryanCafferky 8 months ago

      Hi Ishaq,
      Databricks is a complete, self-contained service available on AWS, Azure, and GCP. It should work the same on all three, with the only differences being how it integrates with cloud-specific back-end services like S3. Also, Azure integrates Databricks in a way that eliminates the need for the customer to have an agreement with both Databricks and Microsoft. It appears as if it were an Azure service. I think AWS requires customers to license with Databricks and AWS when they set it up. So yes, overall, all the Databricks and Spark code and services should be the same on all 3 cloud platforms. Make sense?

  • @anmolchoudhary3982
    @anmolchoudhary3982 3 years ago +1

    ohh man such a detailed and superbly structured content.... I wish I could take you out for beers sometime :)

    • @BryanCafferky
      @BryanCafferky 3 years ago

      Thanks. I appreciate the kind words. It's great to know my work is helpful.

  • @revidenver5142
    @revidenver5142 1 year ago

    The Best explanation, thank you

  • @BillusTinnus
    @BillusTinnus 2 years ago +1

    Fantastic video! Really well done, thank you

    • @BryanCafferky
      @BryanCafferky 2 years ago

      Thank you! Glad they help.

    • @mehmetkaya4330
      @mehmetkaya4330 2 years ago

      I would double that! So concise yet comprehensive overview! Thank you so much!

    • @BryanCafferky
      @BryanCafferky 2 years ago

      @mehmetkaya4330 Thanks!

  • @G47_Code
    @G47_Code 2 years ago

    Thank you so much, Bryan, for the wonderful content!!!

  • @scxry5597
    @scxry5597 10 months ago

    Thank you so much for your videos, i have been looking for this

  • @sehaj778
    @sehaj778 3 years ago +2

    Hi Bryan, I'm currently learning data science on GCP as a beginner. I'm just scratching the surface of the GCP tools and platform. I wanted to learn Spark, and that is why I'm here. Would learning Spark and Databricks on the Microsoft Azure platform be the right idea at this time, given I'm focusing on GCP? Thanks for making this course though. I see so much content here and I'm still on the first video!

    • @BryanCafferky
      @BryanCafferky 3 years ago +3

      Databricks is a service owned by the company Databricks that is available on AWS, Azure, and GCP. It should be the same on any of these platforms, with the only differences being how cloud-specific resources are called or integrated, e.g., Azure Synapse vs. Google's BigQuery. You should be fine using Databricks on GCP, but let me know if you find significant differences.
      Make sense?

  • @jamesschoi87
    @jamesschoi87 1 year ago +1

    28:10 You couldn't install external libraries with open source spark?

    • @BryanCafferky
      @BryanCafferky 1 year ago +1

      You can, but in Databricks you can define libraries for a cluster and Databricks will automatically reinstall them every time the cluster starts. You can even define libraries you want installed on every cluster if you like. Spark does not support cluster stop and start. You have to delete and re-create clusters if you want to stop paying for them. When you create a cluster, you have to do some work to install the libraries you want.
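As a rough illustration of the cluster-scoped libraries described in the reply above, the sketch below builds the JSON body that the Databricks Libraries REST API (`POST /api/2.0/libraries/install`) expects. The cluster ID and package names are made-up placeholders, the helper name `build_library_install_request` is hypothetical, and nothing is actually sent to a workspace.

```python
import json


def build_library_install_request(cluster_id, pypi_packages):
    """Return the request body for installing PyPI packages on one cluster.

    Hypothetical helper: it only builds the payload, it does not call the API.
    """
    return {
        "cluster_id": cluster_id,
        "libraries": [{"pypi": {"package": name}} for name in pypi_packages],
    }


# Placeholder cluster ID and packages, for illustration only.
body = build_library_install_request("0000-000000-example1", ["nltk", "simplejson"])
print(json.dumps(body, indent=2))
```

Posting a body like this to a workspace (with an access token) asks Databricks to attach the packages to that cluster. The same thing can be done by hand on the cluster's Libraries tab in the UI.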

  • @carlosramirez-pf1zq
    @carlosramirez-pf1zq 1 year ago

    Thank you for your explanation of what Spark is. These technologies are confusing at first sight for someone who has never used them.

  • @IvanSedov-i7f
    @IvanSedov-i7f 1 year ago

    Thank you very much, it was very interesting and helpful

  • @lucassaito1791
    @lucassaito1791 2 years ago

    Outstanding content!

  • @bananaboydan3642
    @bananaboydan3642 1 year ago

    This is an amazing video

  • @srajv01
    @srajv01 6 months ago

    Clingon !! That's when I subscribed 😅

  • @davidk7212
    @davidk7212 7 months ago

    Zank you sir for zis tutorial. It is most very velcome.

  • @jay_wright_thats_right
    @jay_wright_thats_right 3 months ago

    I wanted to know why we need to know this. I just felt like I was going through the motions while watching this.

    • @BryanCafferky
      @BryanCafferky 3 months ago +1

      It's more than just coding. You need the background and concepts to be effective. It's a long video series and if you skip the foundation, you will never gain mastery.

    • @stigmartinsen3359
      @stigmartinsen3359 2 months ago

      @BryanCafferky As someone who's been researching the Apache ecosystem for the last month, trying to make sense of what's what with so much overlapping functionality, I greatly appreciate this video. Thank you for the thorough explanation. I look forward to watching and learning from the rest of the videos in this playlist about Spark and Databricks.
      With that said, since some of these videos are a bit old, would you say any of the information in them is outdated?

    • @BryanCafferky
      @BryanCafferky 2 months ago

      @stigmartinsen3359 The Databricks UI has changed a lot, but the functionality has stayed. New functionality has been added, such as Delta Lake, Unity Catalog, and Photon. See this video for an update on these: ruclips.net/video/9YJby_COOdc/видео.html

  • @Navinneroth
    @Navinneroth 2 years ago

    Brilliant analogy sir .. phone books example.. for distributed compute too good.

  • @ThEHaCkeR1529
    @ThEHaCkeR1529 1 year ago

    Thanks a lot!

  • @JCArtuso
    @JCArtuso 3 years ago

    Great! Let's go!

  • @gustavonavesdesouza759
    @gustavonavesdesouza759 10 months ago

    Thanks for that

  • @rydmerlin
    @rydmerlin 2 years ago

    Is your book available in epub format?

  • @GOONER_FOREVER1989
    @GOONER_FOREVER1989 3 years ago

    How do I drop data that was cached to local storage using the Delta cache? I couldn't find a proper command.

    • @BryanCafferky
      @BryanCafferky 3 years ago

      That's a bit beyond the content of this video.

  • @artus198
    @artus198 1 year ago +1

    In general, what I notice is that, compared to the past, they are over-complicating everything. That whole Azure thing in particular is unnecessarily complex. At least on-premises was never this much work!

    • @BryanCafferky
      @BryanCafferky 1 year ago +1

      No, I disagree there. In fact, the point is that cloud-based Databricks is tons easier to use and provides much better tools than using open source Spark on-prem. Not sure what you are looking at. Thanks for your comment.

    • @artus198
      @artus198 1 year ago

      @BryanCafferky E.g., in Databricks, if I want to access DBFS files in another resource group, I have to create a "scope", get access to a vault secret, use the scope to mount that DBFS in my workspace Hive metastore, write a script to mount, write a script to create a temp view, and read the data from that Delta table.
      In SQL Server, I can share a connection string and user/password with somebody else; they can connect to the database from SQL Server Management Studio, enter the details, and run as many queries as they want on that database, joining multiple tables, etc.
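The Databricks steps listed in the comment above (secret scope, then mount) can be sketched roughly as follows. This assumes Azure Blob Storage accessed with an account key kept in a secret scope, which is only one of several possible setups. The function name and all argument values are hypothetical, and `dbutils` is the utility object Databricks provides inside notebooks, passed in explicitly here so the function can be defined anywhere.

```python
def mount_blob_container(dbutils, account, container, scope, key_name, mount_point):
    """Mount an Azure Blob Storage container using a key from a secret scope.

    Hypothetical helper; `dbutils` is the Databricks notebook utility object.
    """
    # Read the storage account key from the secret scope instead of hard-coding it.
    account_key = dbutils.secrets.get(scope=scope, key=key_name)

    # Mount the container so it is reachable under `mount_point` in DBFS.
    dbutils.fs.mount(
        source=f"wasbs://{container}@{account}.blob.core.windows.net",
        mount_point=mount_point,
        extra_configs={
            f"fs.azure.account.key.{account}.blob.core.windows.net": account_key
        },
    )
    return mount_point
```

In a notebook this might be called as `mount_blob_container(dbutils, "mystorageacct", "sales", "my-scope", "storage-key", "/mnt/sales")`, with every one of those values being a placeholder. After that, reads can target paths under `/mnt/sales`.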

  • @PaulEllisBIGDATA
    @PaulEllisBIGDATA 5 days ago

    THANK YOuuuuuuuuu!!!!!!!

  • @erkansirin6849
    @erkansirin6849 2 years ago

    Where's Kubernetes as cluster manager?

  • @youssefloukili1785
    @youssefloukili1785 2 years ago

    thanks

  • @sivachagaleti6614
    @sivachagaleti6614 2 years ago

    Awesome

  • @rohitchakravarthi94
    @rohitchakravarthi94 1 year ago +1

    In real life this is something called "I stumbled and found a gold mine" !

  • @shomero8334
    @shomero8334 2 years ago

    Thank you, man! I was lost at first, I needed your Tutorial so so so so much!!

    • @BryanCafferky
      @BryanCafferky 2 years ago

      Glad it helped! I understand. It is a lot to learn.