My path to data was a little bit unusual, to say the least: I started out in the financial industry using Databricks, and now on side projects I've started working with pandas... funny that I actually used this video backwards, hehe
Would this be a good tool for quickly combining large numbers of CSVs into a single dataframe and then performing manipulations on that dataframe before outputting a single CSV?
Fantastic introduction to PySpark for beginners. Hope to see Andrew Ray again on the stage for other presentations.
The Q&A session at the end is a must-watch. I loved it.
Really nice how we see pandas and pyspark functions side-by-side!
yea I thought the same!
Thank you for such a great presentation for beginners!
He provided a really good comparison between the two!
Cool talk and key differences nicely illustrated.
Here are some more videos on Spark. Spark Interview Questions: ruclips.net/p/PL9sbKmQTkW05mXqnq1vrrT8pCsEa53std
Volume is low! :(
use detachable speakers
This is a great video. Exactly what I'm looking for, thanks very much.
Thank you so much for the Session ❤️
19:12, pandas now has SQL support
Thank you very much for your contribution.
Does it mean that using pyspark sql is the best practice in data wrangling using spark?
Super helpful, thanks for sharing!
Great intro!
PySpark is great when it's read-only. It all goes badly wrong when you try to write anything with a typed schema.
7:49
great presentation!
Really helpful
I think I need a soundbox on full volume to hear this.
I've the same issue, thanks to the captions, I saved a lot of money
Just downloading and writing this code will not work. You have to create a session first.
Which is better in a Databricks environment: Python, R, or SQL? Reply in the comments.
Most people seem to find SQL better.
Nebraska Alumni
What's with the volume?
Too quiet, please fix.
great tech video, but volume really ...
Hey Andrew, could you send me your GitHub link?
LOL, good presentation, but unprepared for the Q&A.
Why did someone ask about UDFs? What do UDFs have to do with Spark?
Just use Koalas.