Great explanation sir, thank you!
You are welcome, Jagadeeswaran! Glad it was helpful!
Nice explanation Raja 👌 👍 👏
Thank you Sravan!
Great content, but I am not able to find all your videos on your page. Can you please add them to one page and share the link?
You can find all the videos under the Videos tab.
Hi Raja, you are not mentioning the DataFrame name while defining the window. Will it automatically point to the DataFrame created in the previous cell? And what if my notebook has multiple DataFrames?
Hi Raju,
A DataFrame must be specified to apply a window function. I would have defined the window logic separately and applied it to a DataFrame later. It is not applied automatically to any DataFrame.
@rajasdataengineering7585 My bad, thanks Raja.
So the logic is defined first, and when we call it on a DataFrame, the logic is applied and the action is performed, right sir? 🙌
Yes that's correct
Have you posted videos on sort merge join, auto broadcast join, and shuffle join? If not, could you post them?
I have already posted a video on broadcast join:
ruclips.net/video/HDiXK3Gl-hs/видео.html
I am yet to post on sort merge and shuffle joins.
Hi Sir,
I am stuck in one scenario. Please assist.
The input DataFrame is as below:
id description
1 "abcd
pqe
rrr
qqq"
2 "ttt
ppp
ooo
www
iii"
3 "aaa
ppp
eee
zzz
rrrr"
4 "ssss
jjjj"
The output DataFrame should contain only the last line of each description, as below:
id description
1 qqq
2 iii
3 rrrr
4 jjjj
Hi Sachin, the split function can be used for this requirement.
@rajasdataengineering7585 Yes Sir. By using split I am able to separate the lines, but I am unable to extract the last element from the DataFrame cell, because the last line can be at any position. It may be the 2nd, 3rd, 5th, or any other line.
Try using index [-1] on the split result.
If that does not work, we can count the number of elements and use that count to pick the last one in a UDF.
@rajasdataengineering7585
Done! Thanks 🙂
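from pyspark.sql.functions import udf

def get_last_ele(str1):
    # Split the cell on newlines and return the last line
    return str1.split("\n")[-1]

# Wrap the Python function as a Spark UDF (returns a string by default)
stringUDF = udf(get_last_ele)
df1.withColumn("last element", stringUDF("name")).show()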
Welcome 🙂
I will try to post an optimized solution, if there is one, when I get some time.
Happy to know that you solved this requirement in quick time!
Thank you
You're welcome