Read Spark DataFrame from different paths | Spark Interview Question

  • Published: 10 Feb 2025
  • Hi All,
    In this video, I have explained a very interesting Spark interview question, where we have to read a Spark DataFrame from multiple file paths.
    To become a GKCodelabs Extended plan member, check the links below and purchase the Big Data end-to-end pipeline course in your preferred language, Python or Scala
    PySpark course available at
    courses.gkcode...
    Spark + Scala course available at
    courses.gkcode...
    End to End pipeline Introduction Videos:
    Pyspark End to End Pipeline
    • BIG DATA COMPLETE PROJ... ​
    Spark + Scala End to End Pipeline
    • BIG DATA complete PROJ... ​
    Starter Pack available at just: ₹549 (For Indian Payments) or $9 (For non-Indian payments)
    Extended Pack available at just: ₹1299 (For Indian Payments) or $19 (For non-Indian payments)
    Queries? Write to us at support@gkcodelabs.com
    Website: www.gkcodelabs...
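
The interview question in the video can be sketched roughly as follows. This is a minimal illustration, not the video's exact code: it assumes a local SparkSession and hypothetical JSON files under the given paths. `DataFrameReader.json` (like `csv` and `parquet`) is a varargs method, so multiple paths can be passed directly, expanded from a `Seq`, or matched with a glob pattern:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object MultiPathRead {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("MultiPathRead")
      .master("local[*]")
      .getOrCreate()

    // 1) Pass several paths directly to the varargs reader
    //    (the paths below are hypothetical).
    val df1: DataFrame = spark.read.json("/data/day1", "/data/day2")

    // 2) Expand a Seq of paths into the same varargs parameter.
    val paths = Seq("/data/day1", "/data/day2", "/data/day3")
    val df2: DataFrame = spark.read.json(paths: _*)

    // 3) Glob patterns are also accepted by the underlying file index.
    val df3: DataFrame = spark.read.json("/data/day*")

    df2.show()
    spark.stop()
  }
}
```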

Comments • 11

  • @mnatum-v17
    @mnatum-v17 3 years ago

    Cool

  • @vishalgaikwad1439
    @vishalgaikwad1439 1 year ago

    We can give all the file paths directly inside the read method:
    lis = Seq(path1, path2, path3)
    spark.read.json(lis: _*)
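
The `: _*` in the comment above is standard Scala varargs expansion, which is why a `Seq` of paths works with `spark.read.json(paths: String*)`. A minimal pure-Scala sketch of the mechanism (no Spark required; `readAll` is a hypothetical stand-in for the reader method):

```scala
object SplatDemo {
  // Hypothetical stand-in for spark.read.json(paths: String*):
  // a varargs method that receives each path as a separate argument.
  def readAll(paths: String*): Int = paths.length

  def main(args: Array[String]): Unit = {
    val lis = Seq("path1", "path2", "path3")
    // `lis: _*` expands the Seq into the varargs parameter.
    println(readAll(lis: _*)) // prints 3
  }
}
```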

  • @playandlearnwithsiddhigahi1126
    @playandlearnwithsiddhigahi1126 3 years ago

    This is an awesome video, sir.

  • @vishalmaurya972
    @vishalmaurya972 3 years ago

    That's a great video and concept. It would be great if you could also provide the working files somewhere so that we can try it hands-on.

  • @manojt7012
    @manojt7012 3 years ago

    Wow, so useful.

  • @gayathrilakshmi6087
    @gayathrilakshmi6087 3 years ago

    Please do these scenarios in PySpark as well... PySpark is more prevalent nowadays.

  • @premprasun85
    @premprasun85 3 years ago

    It's an awesome video... please make a video on important DataFrame operations like join, filter, etc.

  • @ashutoshrai5342
    @ashutoshrai5342 3 years ago

    Sir, please make a playlist on CI/CD.

    • @GKCodelabs
      @GKCodelabs 3 years ago

      The next one will be on exactly that, don't worry 👍

  • @prasanthg6506
    @prasanthg6506 3 года назад

    Hi Bro,
    Please answer this interview question of mine:
    when both DataFrames are large and I need to perform a join, we can't use a broadcast join, right?
    In this case, which join should we prefer, shuffle hash or sort merge, and why is it better?
    Note: the join keys here are completely unique.
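
    For context on the question above: when neither side fits in memory to broadcast, Spark chooses between sort-merge and shuffle-hash join, and since Spark 3.0 you can nudge the planner with join hints. A hedged sketch, assuming two hypothetical DataFrames `orders` and `customers` standing in for the large inputs:

    ```scala
    import org.apache.spark.sql.SparkSession

    object JoinHintSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().master("local[*]").getOrCreate()
        import spark.implicits._

        // Hypothetical inputs; in a real job these would come from storage
        // and be far too large to broadcast.
        val orders    = Seq((1, "o1"), (2, "o2")).toDF("cust_id", "order_id")
        val customers = Seq((1, "a"), (2, "b")).toDF("cust_id", "name")

        // The default for large equi-joins is usually sort-merge join.
        val smj = orders.join(customers, "cust_id")

        // Hint toward shuffle-hash join: it skips the sort, but the build
        // side must fit in each task's memory (hint recognized in Spark 3.0+).
        val shj = orders.join(customers.hint("shuffle_hash"), "cust_id")

        smj.explain() // inspect which physical join the planner picked
        shj.explain()
        spark.stop()
      }
    }
    ```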