Apache Spark Transformations and Actions

  • Published: Nov 8, 2024

Comments • 32

  • @IsaiahShadE · 3 years ago +2

    Probably the only person who tells you facts and reality in the data science community.

  • @HaridasJanjire · 4 years ago +2

    Very good.. very helpful for learning Apache Spark with a real end-to-end business case.

  • @ayeshababar-fl4ev · 10 months ago

    Very elaborate and well-explained! Can you please share the code and notebook?

  • @IsaiahShadE · 3 years ago +1

    Sir, you are an inspiration.

  • @KishoreKumar-yx4nw · 4 years ago +1

    Thanks Srinivasan for the wonderful explanation

  • @mukeshkesavan4852 · 2 years ago

    Thanks a ton! You made Spark easy. Please make a video on how to optimize Spark code and handle data skewness.

  • @mateen161 · 4 years ago +1

    Thanks Srivatsan...Nice explanation!

  • @sudippandit1 · 4 years ago +1

    Excellent presentation sir!!

  • @ranjanirajamani7565 · 4 years ago +3

    Thank you, Sir. My learning curve with regard to Spark has taken an exponential trend after watching your videos. It has been a rich learning experience, and I have been trying to practice this in parallel. I have a question regarding DataFrames in PySpark. When I tried to create the variable "bad_loan" using withColumn and when (for the various cases of loan_status), the variable doesn't get created in the table, though I can see it in the DataFrame. When I try to access this column using a select statement, I get an error. Can you please throw some light on this?

    • @AIEngineeringLife · 4 years ago

      Thanks Ranjani.. did you assign it to a DataFrame and use that DataFrame to save? In my video I think I saved the old DataFrame object and not the one with the newly assigned columns. Can you please validate that?

    • @ranjanirajamani7565 · 4 years ago

      @@AIEngineeringLife Thank you for the response, Sir. I was able to resolve this issue. It was related to the way the when function was to be used.
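
      A minimal PySpark sketch of the fix discussed in this thread; the DataFrame name loan_df and the loan_status values checked are assumptions, not taken from the video:

        from pyspark.sql import functions as F

        # withColumn returns a NEW DataFrame; assign it to a variable and use
        # that variable afterwards -- the original loan_df is left unchanged.
        loan_df2 = loan_df.withColumn(
            "bad_loan",
            F.when(F.col("loan_status").isin("Charged Off", "Default"), 1).otherwise(0)
        )

        # Select from loan_df2 (not loan_df); loan_df2 is also the object that
        # should be saved or registered as a table.
        loan_df2.select("loan_status", "bad_loan").show(5)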

  • @saurabhjain1626 · 4 years ago +2

    Thank you for the wonderful video... I have a question: as you mentioned, you should use sortWithinPartitions to avoid expensive transformations when you know that the particular data is in one partition. How will you know that? I am assuming that is only possible when you partition the data based on the values of that particular column.
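
    A minimal sketch of that pattern, assuming a hypothetical df with Country and Date columns: repartitioning on the column first co-locates each country's rows in one partition, after which sortWithinPartitions avoids the global shuffle that a full orderBy would trigger.

      from pyspark.sql import functions as F

      # Place all rows for a given Country in the same partition, then sort
      # rows only within each partition (no cluster-wide sort/shuffle).
      df_local_sorted = (
          df.repartition("Country")
            .sortWithinPartitions(F.col("Date").desc())
      )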

  • @nagarajuch2412 · 4 years ago +1

    The videos are all very informative.
    Is there any way we can sort based on more than one attribute? E.g., Country ascending and Date descending.

    • @nagarajuch2412 · 4 years ago +2

      Ans: orderBy(col("City").asc(), col("Date").desc())

    • @AIEngineeringLife · 4 years ago

      @@nagarajuch2412 .. You got the answer :) .. It is there in one of my data engineering videos as well
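
      A runnable form of that answer, using the Country and Date columns from the question (df is a placeholder DataFrame):

        from pyspark.sql.functions import col

        # Sort by two attributes: Country ascending, then Date descending.
        df.orderBy(col("Country").asc(), col("Date").desc()).show(5)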

  • @taliacohen7872 · 2 years ago

    Amazing video thank you!!!!

  • @viBeotamil · 3 years ago +1

    Amazing video sir.

  • @designwithpicmaker2785 · 4 years ago +1

    Thank you bro, thanks for this wonderful video content.

  • @AkshayKumar-xo2sk · 3 years ago

    @AIEngineering - Thanks a lot for your video. May I kindly check whether all your Spark video code is based on Python? You don't use Scala/Java? Can whatever we do in Scala/Java also be done using Python?

    • @AIEngineeringLife · 3 years ago

      All of my videos use PySpark, so Python is the one I have used, but the same can easily be done in Scala as well.

    • @AkshayKumar-xo2sk · 3 years ago

      @@AIEngineeringLife - do you think the CCA175 Cloudera certification for Apache Spark and Hadoop developers is a good one to attempt for someone working as a Data Engineer? Do you recommend any other certifications? And can the certification be done using PySpark as well? Your help is highly appreciated.

  • @naveenreddythirugudu · 3 years ago

    Best video 👍

  • @kkckvr · 4 years ago +1

    Thanks a lot

  • @rajeevrajeev5244 · 3 years ago

    Do you have this Databricks page somewhere in git?

  • @deepakparamesh8292 · 4 years ago

    Very nice explanation, sir... could you please upload the code?

    • @AIEngineeringLife · 4 years ago +1

      Deepak.. Spark videos are not yet in my git repo.. it will take time to get there. Below is my repo that has the other video code at this time:
      github.com/srivatsan88/YouTubeLI

  • @Cricketpracticevideoarchive · 4 years ago +1

    Grateful for this series
    Day 3 : colab.research.google.com/drive/1yTDcFFcUAynSXqZxjmu6UJ8bFAkEgnqV?usp=sharing&authuser=1#scrollTo=O9naSW-WLWR5

  • @kketanbhaalerao · 3 years ago +1

    Please provide your GitHub link, and also provide the corona data and Twitter data.

    • @AIEngineeringLife · 3 years ago +2

      You can find all codes here - github.com/srivatsan88/Mastering-Apache-Spark