I never had stopped to understand this topic but I had to use it last week and I was confused. Thanks for the clear explanation!
Happy to help. 😀
Thanks for this sir..
Glad you liked it 👍
I am happy to say that I learned a new thing today. The code you explained is much easier to write compared to the traditional approach. Very well explained. Thank you very much!
Glad you liked it and it helped.
That was one close to perfect tutorial for me.
Please let me know where I missed the perfect score.
Ha Ha
Glad you liked it and happy to help 😊
Thanks for explaining an advanced topic in such an easy way
Happy to help.
Any other topic would you like me to cover in future videos ?
Thanks brother you are life saver. May god bless you.
Glad you liked it
Gone through many videos and articles but your explanation cleared all doubts
Glad it helped.
Very nice explanation,thanks
Glad you liked it 👍
Really amazing all in 1 video very crisp and covered all the combinations. Really Thanks!
Glad you liked it 👍
You make it easy 👍👍
Glad you liked it 👍
You keep saying "current row" - how do you decide which row is the current row, and in which real-time scenarios do we use it?
Hi Priyanshu
I will try a different approach to explain it.
Say you have 100 apples in a bucket and you want to move all of them to another bucket, one apple at a time. You also want to give each apple a serial number based on weight, such that the heaviest apple gets number 1 and the lightest gets number 100. So you first sort all the apples by weight, then pick the heaviest one, mark it as 1, and put it in the new bucket. This apple is the current apple for now. Once you pick the next apple for labelling, it becomes the current apple, and the one already in the new bucket is the preceding one. So serial 1 is preceding, 2 is current, and 3 is following.
Similarly, when Teradata scans a table and retrieves the rows in the order given by the ORDER BY you specified in the OLAP function, it picks the first row and applies the function to it, then the next row, and so on.
There are many use cases for this. For example, if you want to compare today's sales amount with yesterday's to see whether today was profitable and by how much, you can compare the current row with the previous row and simply subtract the sales amounts.
Example
Date | Sale_Amount
01-Jan | 2000
02-Jan | 3000
03-Jan | 2500
For 01-Jan there is no previous row, so the profit is NA.
For 02-Jan, you do current row (3000) - previous row (2000) and get a profit of Rs 1000.
For 03-Jan, you do current row (2500) - previous row (3000) and get Rs -500, a loss.
Hope this helps.
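The day-over-day comparison described above can be sketched with the LAG window function. The snippet below uses Python's built-in SQLite for illustration only (the thread is about Teradata, where the same `LAG(...) OVER (ORDER BY ...)` syntax applies); the table and column names are made up:

```python
import sqlite3

# Day-over-day profit via LAG: each row is compared to the previous row
# in sale_date order, as in the example above.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (sale_date TEXT, sale_amount INTEGER)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("01-Jan", 2000), ("02-Jan", 3000), ("03-Jan", 2500)],
)

rows = con.execute("""
    SELECT sale_date,
           sale_amount,
           sale_amount - LAG(sale_amount) OVER (ORDER BY sale_date) AS profit
    FROM sales
""").fetchall()

for r in rows:
    print(r)
# ('01-Jan', 2000, None)   -- no previous row, so profit is NULL/NA
# ('02-Jan', 3000, 1000)
# ('03-Jan', 2500, -500)
```

Note that for the first row LAG returns NULL, which matches the "NA" case for 01-Jan.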
@@ETLSQL so it's like a random selection followed by a sorting step?
thanks
Yes. The selection looks random also because Teradata distributes data across AMPs. Each AMP returns the rows it has, and then the sorting and the corresponding analytical function are applied.
@@ETLSQL A good example. But be careful that Apple doesn't sue you ;-)
Thank you very much! Well explained!
Glad you liked it.
Better explanation than chatgpt
Ha Ha.
Very well explained, I have been looking for a video like this. Thank you !
Glad it was helpful!
Very well explained, thank you so much!
Glad it was helpful!
Great explanations, thanks so much
Glad you liked it.
awesome explanation. super cool.
Glad you liked it!
Excellent, Thank you.
Glad you liked it 👍
Great video, thank you!
Glad you liked it
What if you want the frame to be based on a date? Instead of looking back 7 rows, look back 7 days, which could be 3 rows or 15 rows?
Interesting question.
Let me know if you find the efficient solution for this.
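One way a date-based lookback is often handled is to drop the fixed window frame and self-join on a date range instead, so each day sums everything within the prior 7 calendar days regardless of how many rows each day has. A sketch using SQLite for illustration (table and column names `txn`, `dt`, `amount` are made up, not from the video):

```python
import sqlite3

# 7-calendar-day rolling sum via a range self-join: for each distinct date,
# sum all rows whose date falls in [that date - 6 days, that date].
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE txn (dt TEXT, amount INTEGER)")
con.executemany("INSERT INTO txn VALUES (?, ?)", [
    ("2023-01-01", 100), ("2023-01-01", 50),  # a day can have any number of rows
    ("2023-01-05", 200),
    ("2023-01-10", 300),
])

rows = con.execute("""
    SELECT a.dt, SUM(b.amount) AS last_7_days
    FROM (SELECT DISTINCT dt FROM txn) a
    JOIN txn b
      ON b.dt BETWEEN date(a.dt, '-6 days') AND a.dt
    GROUP BY a.dt
    ORDER BY a.dt
""").fetchall()
print(rows)
```

On large tables this join can be expensive; databases that support `RANGE ... PRECEDING` with an interval offset can express the same thing as a window frame.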
This is really well explained.
Glad you liked it.
thank you
Happy to help ☺️
Great explanation, very useful 👍
Glad it was helpful!
Very useful 👍
Glad to hear that
Hey this was awesome! Great explanations, thanks so much
Glad it was helpful!
THANK YOU SO MUCH!
Happy to help
Thanks for this.
Happy to help.
Will you please explain the difference between range and rows?
It depends on the distribution of the data you have: if there are semantic gaps (sparsity), you use ROWS; otherwise (dense distribution), you use RANGE. For a better understanding, look up the difference between logical and physical grouping within the computation window.
I hope this is clear to you now
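The logical-vs-physical grouping point above can be made concrete: with duplicate ORDER BY values, ROWS counts physical rows one by one, while RANGE treats all peers (rows with equal ordering values) as one logical group. A minimal sketch in SQLite (illustrative table names, not from the video):

```python
import sqlite3

# Running sums over the same data with ROWS vs RANGE frames.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (k INTEGER, v INTEGER)")
con.executemany("INSERT INTO t VALUES (?, ?)", [(1, 10), (2, 20), (2, 30), (3, 40)])

# Physical frame: each row extends the running sum individually, so the two
# k=2 rows get different sums (their relative order within the tie is arbitrary).
rows_sum = con.execute("""
    SELECT k, SUM(v) OVER (ORDER BY k
                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    FROM t ORDER BY k, v
""").fetchall()

# Logical frame: peers are included together, so both k=2 rows see 10+20+30 = 60.
range_sum = con.execute("""
    SELECT k, SUM(v) OVER (ORDER BY k
                           RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    FROM t ORDER BY k, v
""").fetchall()

print(rows_sum)
print(range_sum)   # [(1, 10), (2, 60), (2, 60), (3, 100)]
```

This is also why a ROWS frame over a non-unique ORDER BY can give nondeterministic results, while RANGE stays stable.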
Hello, this is really an amazing video! All those row_number & CTE workarounds are gone.
I have 2 questions :
1. What if I want to fill my current row with data from the previous row and the value is a string, not an integer?
2. What if I want to fill the current row from previous row data but don't know the exact row position where the data is present (i.e., not null), say for a large dataset? In this case, how do I identify which previous row has a value?
Thank you so much for explaining this concept Sir.😊
Subscribed ❤
Thank you. From the 2 questions, I can clearly see that you have built a next-level understanding of SQL concepts, so kudos to you.
1) When we say previous row, current row, or next row, there is a sort order tagged to it; otherwise, how could we determine which row is previous or next? So it does not matter whether the column is string or numeric, as long as the ORDER BY clause has the proper columns for the requirement.
Beware: 2000 > 30, but '2000' < '30', so use the ORDER BY column carefully with string values.
Generally, we use numeric or date columns.
2) You can use UNBOUNDED PRECEDING with the MAX/MIN function (depending on the sorting) to get the value of the last NOT NULL row.
Let me know if you need more input on this. I can create a quick video for you with explanation :)
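Suggestion 2) can be sketched as follows. MAX over an UNBOUNDED PRECEDING frame ignores NULLs, so it carries the last non-NULL value forward; the caveat (which is what "depending on the sorting" hints at) is that this is only a true last-value fill when the values are non-decreasing in the sort order. Illustrated with SQLite and made-up table names:

```python
import sqlite3

# Fill NULLs from the last non-NULL value using a cumulative MAX.
# Works for strings too, since MAX compares them lexicographically.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE obs (id INTEGER, val TEXT)")
con.executemany("INSERT INTO obs VALUES (?, ?)",
                [(1, "A"), (2, None), (3, None), (4, "B"), (5, None)])

rows = con.execute("""
    SELECT id,
           MAX(val) OVER (ORDER BY id
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS filled
    FROM obs ORDER BY id
""").fetchall()
print(rows)   # [(1, 'A'), (2, 'A'), (3, 'A'), (4, 'B'), (5, 'B')]
```

If the values are not monotonic, databases that support `LAST_VALUE(... IGNORE NULLS)` give an exact fill instead of this MAX/MIN trick.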
@@ETLSQL Thanks for your quick reply! Yes, I got it and understand the logic behind both questions.
You can make a video on this scenario :
We have a table with 3 columns - cust_id, status, captured_dt.
Example:
Cust_id | Status | captured_dt
11 | Opt-In | 2021-01-01
11 | Opt-out | 2021-05-01
11 | Opt-In | 2022-03-01
Now I have to expand this for each month up to May 2023, so the data looks like this:
Cust_id | Status | captured_dt
11 | Opt-In | 2021-01
11 | Opt-In | 2021-02
11 | Opt-In | 2021-03
11 | Opt-In | 2021-04
11 | Opt-out | 2021-05
11 | Opt-out | 2021-06
....
Like this till last month (may-2023).
PS: avoid using a CTE.
Sorry for the long explanation! 😬 Just encountered this problem somewhere.
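One possible CTE-free sketch of this month-expansion: join a pre-built month calendar to the status table, and for each month pick the latest status captured on or before that month via a correlated subquery. Shown in SQLite for illustration, with a hand-built `months` table standing in for a warehouse calendar/date dimension (the `status_hist` name is made up; data follows the example above):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE status_hist (cust_id INTEGER, status TEXT, captured_dt TEXT)")
con.executemany("INSERT INTO status_hist VALUES (?, ?, ?)", [
    (11, "Opt-In",  "2021-01-01"),
    (11, "Opt-out", "2021-05-01"),
    (11, "Opt-In",  "2022-03-01"),
])

# In a real warehouse this would be a calendar/date dimension table.
con.execute("CREATE TABLE months (month TEXT)")
con.executemany("INSERT INTO months VALUES (?)",
                [("2021-0%d" % m,) for m in range(1, 7)])  # 2021-01 .. 2021-06

rows = con.execute("""
    SELECT m.month,
           (SELECT s.status
            FROM status_hist s
            WHERE s.cust_id = 11
              AND strftime('%Y-%m', s.captured_dt) <= m.month
            ORDER BY s.captured_dt DESC
            LIMIT 1) AS status
    FROM months m ORDER BY m.month
""").fetchall()
for r in rows:
    print(r)
```

Each month picks up the most recent status change at or before it, so 2021-01 through 2021-04 show Opt-In and 2021-05 onward show Opt-out, matching the expected output above.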
This is superb!!
I have 2 Questions:
1. How is the performance of this query compared to our usual group by / join / window functions? (I have data of about 2 billion rows)
2. Also, I have a use case that involves going back 7 days based on a 'date' column (since one day can have a random number of rows, I cannot use 7 PRECEDING) and doing a SUM(amount). Is there a way I can achieve this using these techniques?
Hi Bhuvanesh,
Glad you liked it.
1) Performance-wise, in most cases it should be better than a combination of GROUP BY + JOIN. But you may run both SQLs and check.
2) In this case, I think it is better to approach the problem by creating a histogram of 7-day buckets. As the number of rows is dynamic, implementing it with a straightforward window function could be challenging here.
Best,
Nitin
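The "histogram of 7-day buckets" idea above can be sketched like this: assign each row to a bucket by integer-dividing its day offset from a start date by 7, then aggregate per bucket. SQLite is used for illustration and the names (`txn`, `dt`, `amount`, the start date) are assumptions:

```python
import sqlite3

# Fixed 7-day buckets: bucket 0 covers days 0-6 from the start date,
# bucket 1 covers days 7-13, and so on.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE txn (dt TEXT, amount INTEGER)")
con.executemany("INSERT INTO txn VALUES (?, ?)", [
    ("2023-01-01", 100), ("2023-01-03", 50),
    ("2023-01-08", 200), ("2023-01-15", 300),
])

rows = con.execute("""
    SELECT CAST((julianday(dt) - julianday('2023-01-01')) / 7 AS INTEGER) AS bucket,
           SUM(amount)
    FROM txn
    GROUP BY bucket ORDER BY bucket
""").fetchall()
print(rows)   # [(0, 150), (1, 200), (2, 300)]
```

Unlike a rolling 7-day window, these buckets are non-overlapping, which is what makes them cheap to compute on very large tables.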