Great summary! One minor correction is necessary I think for the last solution: in the window function of the 1st CTE, we should add pref_delivery_date to the order by clause as 1 user could have 2 orders with pre_delivery_date J and J+3 on order date J (first day) for example, without pref_delivery_date in the order by clause, it could randomly assign order with delivery date J+3 as row no 1 and thus not consider this as immediate order whereas it should have been !
Many of you have asked me to share my presentation notes, and now… I have them for you! Download all the PDFs of my Notion pages at www.emmading.com/get-all-my-free-resources. Enjoy!
We should be very careful when using AVG(CASE WHEN...), because it will count duplicate values for denominator and it could be a problem when total count has duplicates but the question doesn't want duplicates in the denominator. For example, we are dealing with order table (duplicate values for customer column) and the question is asking for the percentage of certain customers.
customer_id 3 also had an immediate delivery right? since the dates are same in both columns of (order_date and pref_delivery_date) = 2019-10-12 or did i miss something?
No, because imediate delivery also means the dates of the columns is equal on their first delivery, but in the case of customer id 3, it is his second delivery.
Great summary! One minor correction is necessary I think for the last solution: in the window function of the 1st CTE, we should add pref_delivery_date to the order by clause as 1 user could have 2 orders with pre_delivery_date J and J+3 on order date J (first day) for example, without pref_delivery_date in the order by clause, it could randomly assign order with delivery date J+3 as row no 1 and thus not consider this as immediate order whereas it should have been !
Many of you have asked me to share my presentation notes, and now… I have them for you! Download all the PDFs of my Notion pages at www.emmading.com/get-all-my-free-resources. Enjoy!
Thanks a Ton Emma Ding.
We should be very careful when using AVG(CASE WHEN...), because it will count duplicate values for denominator and it could be a problem when total count has duplicates but the question doesn't want duplicates in the denominator. For example, we are dealing with order table (duplicate values for customer column) and the question is asking for the percentage of certain customers.
A great solution, as shown by Emma, is using window function and order_rnk=1 to guarantee denominator having unique values ( distinct customers).
Good point, although in the second example it’s a requirement to not have duplicates since a customer can only have 1 first order
Use sum, div0 and distinct
great video, please share some insights on how to answer Guesstimates
customer_id 3 also had an immediate delivery right? since the dates are same in both columns of (order_date and pref_delivery_date) = 2019-10-12 or did i miss something?
No, because imediate delivery also means the dates of the columns is equal on their first delivery, but in the case of customer id 3, it is his second delivery.
@@נדברבינוביץ Thanks , I didn't notice that 2019-10-12 was customer_id 3's second order, first being on 2019-10-09