Keeping Spark on Track: Productionizing Spark for ETL: talk by Kyle Pistor and Miklos Christine

  • Published: 16 Jan 2025

Comments • 7

  • @lackshubalasubramaniam7311 6 years ago

    Great talk. I'm researching how to implement ETL with Databricks so I can come up with guidance or a framework. These best practices are really helpful.

  • @marcosoliveira8731 6 years ago

    Really good points and explanation.

  • @djibb.7876 7 years ago

    Great talk!!!
    I set up a Spark cluster with 2 workers. I save a DataFrame in Parquet format using partitionBy("column x") to the same path on each worker. The problem is that I am able to save it, but when I try to read it back I get these errors: - Could not read footer for file file´status ...... - unable to specify Schema ... Any suggestions?
