Swiggy Data Analyst SQL Interview Question and Answer
HTML-код
- Опубликовано: 2 май 2024
- In this video we will discuss a sql interview question asked in Swiggy for a Data Analyst Position. Here is the script:
-- Create the table
CREATE TABLE stock (
supplier_id INT,
product_id INT,
stock_quantity INT,
record_date DATE
);
-- Insert the data
delete from stock;
INSERT INTO stock (supplier_id, product_id, stock_quantity, record_date)
VALUES
(1, 1, 60, '2022-01-01'),
(1, 1, 40, '2022-01-02'),
(1, 1, 35, '2022-01-03'),
(1, 1, 45, '2022-01-04'),
(1, 1, 51, '2022-01-06'),
(1, 1, 55, '2022-01-09'),
(1, 1, 25, '2022-01-10'),
(1, 1, 48, '2022-01-11'),
(1, 1, 45, '2022-01-15'),
(1, 1, 38, '2022-01-16'),
(1, 2, 45, '2022-01-08'),
(1, 2, 40, '2022-01-09'),
(2, 1, 45, '2022-01-06'),
(2, 1, 55, '2022-01-07'),
(2, 2, 45, '2022-01-08'),
(2, 2, 48, '2022-01-09'),
(2, 2, 35, '2022-01-10'),
(2, 2, 52, '2022-01-15'),
(2, 2, 23, '2022-01-16');
Zero to hero(Advance) SQL Aggregation:
• All About SQL Aggregat...
Most Asked Join Based Interview Question:
• Most Asked SQL JOIN ba...
Solving 4 Trick SQL problems:
• Solving 4 Tricky SQL P...
Data Analyst Spotify Case Study:
• Data Analyst Spotify C...
Top 10 SQL interview Questions:
• Top 10 SQL interview Q...
Interview Question based on FULL OUTER JOIN:
• SQL Interview Question...
Playlist to master SQL :
• Complex SQL Questions ...
Rank, Dense_Rank and Row_Number:
• RANK, DENSE_RANK, ROW_...
#sql #dataanalyst #datanalytics
Please like the video for more interesting videos.
Nice one 😍 Thanks alot @Ankit Sir apke channel se concept clear ho gye sabhi
I can relate it with one of your previous videos where you had taught about creating groups. This is next level as always.
Thank you ❤
Thank you for bringing up this level of Sql Problems. Here's my Solution :-
with cte as
(
select
*, DAY(record_date) - ROW_NUMBER() over(partition by supplier_id, product_id order by record_date) grp
from
stock
where
stock_quantity < 50
)
select
supplier_id, product_id,
count(*) no_of_days,
min(record_date) first_date
from
cte
group by
supplier_id, product_id, grp
having
count(*) > 1
order by
supplier_id, product_id;
good!
Amazing and Tricky question, that combines multiple concepts in one go
Hello ankit here is my simple solution:
with cte as(select *,dateadd(day,-1*row_number()over(partition by supplier_id,product_id order by record_date),record_date) as date_grp
from stock
where stock_quantity=2;
this one is mentos Zindagi
Superb explanation Ankit 👌 👏 👍
What an Amazing solution!
Thank you for such and amazing question
The problem is a bit tricky but Ankit sir made it very easy. Thanks a lot sir
Most welcome😊
Detailed solution Ankit bhaiya is really easy to understand.... And helpful Thank you 👍❤❤
Most welcome 😊
I wonder if interviewer is willing to give 15-30 minute to sole such kind of hard problem considering the time pressure it might even take more time to think while performing all this in a live interview, also they won't be providing sql workbench environment, Also for how many year of experience candidate these type of questions are expected ? Can you please share your view @ankit ?
Right
Thanks Ankit for sharing this problem statement. Here is my approach:
with t1 as
(SELECT *
, strftime('%d', record_date) - (row_number() over(partition by supplier_id, product_id order by record_date)) as group_number
FROM stock s1
where stock_quantity < 50)
select
supplier_id,
product_id,
count(*) as total_days,
min(record_date) as first_date
from t1
group by
supplier_id,
product_id,
group_number
HAVING count(*)>=2;
with cte as (
select *, dense_rank() over(partition by supplier_id, product_id order by record_date) rn,
record_date - (dense_rank() over(partition by supplier_id, product_id order by record_date)) as datediff
from stock
where stock_quantity 1
;
using Gaps and Island approach-
with t1 as
(select supplier_id, product_id, stock_quantity, record_date,
row_number() over(partition by supplier_id, product_id order by record_date) as rnk
from stock where stock_quantity=2 order by 1,2,3
with cte as (select *,
ROW_NUMBER() over (partition by supplier_id,product_id order by record_date) as rnk
from stock
where stock_quantity < 50),
cte2 as (
SELECT
*,
DATE_SUB(record_date, interval rnk day) as grp_date
from cte)
select supplier_id,product_id,count(*) as no_of_stocks,min(record_date)
from cte2
group by supplier_id,product_id,grp_date
having count(*)>1
👌
Figured it out till DATEDIFF but couldn't figure out how to remove those single date rows.
My easy approach,
with cte as
(
SELECT *,datepart(day,record_date)-row_number() over (partition by supplier_id, product_id order by record_date) as rows_diff
FROM stock WHERE stock_quantity = 2
One Question when you are 14:17 why did not considered row 8 and 9 as a group where supplier id is and product id is 2 and they are on the consecutive days like 8th and 9th? if anyone can help me with this
Thanks @ankit
with cte as (
select *,
ROW_NUMBER()over(partition by supplier_id,product_id order by record_date asc) as rn,
DATEADD(day,-1*(ROW_NUMBER()over(partition by supplier_id,product_id order by record_date asc)),
record_date
) as keyy
from stock
where stock_quantity < 50
)
select supplier_id,product_id ,count(*) as number_of_days,
MIN(record_date) as first_date
from cte
group by supplier_id,product_id,keyy
having count(*)>1
my solution using PG
with cte as (
select * from stock where stock_quantity=2
order by supplier_id,product_id,rd
with cte as(
select *,record_date-(dense_rank() over(partition by supplier_id, product_id order by record_date)) as datediff from stock where stock_quantity=2
Hello Ankit,
Below is my approach.
WITH grp_date AS (
Select *,
dateadd(day- (ROW_NUMBER() OVER (partition by supplier_id, product_id ORDER BY record_date) -1) ),record_date ) AS NewDate from stock where stock_quantity=2
with cte as(
select *,
row_number()over(partition by product_id,supplier_id order by record_date ) as dd
from stock where stock_quantity= 2
Easy solution: with cte as (
select *, lead(record_date, 1) over (partition by supplier_id, product_id order by record_date) as next_date,
lead(record_date, 2) over (partition by supplier_id, product_id order by record_date) as next_to_date
from stock
where stock_quantity < 50
)
select supplier_id, product_id, record_date as first_date from cte
where datediff(next_date, record_date) = 1 and datediff( next_to_date, record_date) = 2
Question is very tricky.
i wrote below code for same ouput
with cte as(
select * , rank() over(partition by supplier_id order by record_date) as rn
from stock
where stock_quantity=2)
select supplier_id,product_id,no_of_days,first_date from cte2
My Solution
with cte as ( select *,
extract(day from record_date ) as day_date,
row_number() over(partition by supplier_id , product_id order by record_date ) as rn,
extract(day from record_date ) - row_number() over(partition by supplier_id , product_id order by record_date )
as flag from stock
where stock_quantity < 50 )
select supplier_id, product_id , min(record_date) as first_recorddate, count(*) as nofConsecutivedays from cte
group by supplier_id, product_id ,flag
having count(*) >= 2 ;
select supplier_id,product_id, count(rw) as cnt, min (record_date) as start_date from (
select *, concat(date_sub(record_date, INTERVAL rw day),",",supplier_id) as group_id
from (
SELECT *, row_number() over (partition by supplier_id,product_id order by record_date) as rw FROM stock
where quantity < 50
order by supplier_id))
group by group_id,1,2
having cnt>=2
order by supplier_id,product_id
My Solution If the table has same year only :)
with cte as
(select *, datepart(day, record_date)-row_number() over (order by (select null))rn
from stock)
select supplier_id, product_id, count(*)no_of_days, min(record_date) as first_date from cte
where stock_quantity < 50
group by supplier_id, product_id, rn
having count(*) >= 2
with x as (SELECT *,date_add(record_date,interval -row_number() over (partition by supplier_id ,product_id order by record_date) DAY)grp_date,
row_number() over (partition by supplier_id ,product_id order by record_date)rn
FROM stock_data)
select x.supplier_id,x.product_id,count(1)no_of_days,min(record_date)first_date from x where x.stock_quantity=2
My Solution:
with stock50 as (
select supplier_id, product_id, record_date ,
lag(record_date, 1, record_date) over(partition by supplier_id, product_id order by record_date) as prev_date
from stock
where stock_quantity < 50
)
, grouped_stock as (
select
supplier_id, product_id, record_date
, sum(case when datediff(day, prev_date, record_date)=2
Ms Sql:
with cte as(select *,
row_number() over(partition by supplier_id, product_id order by iif( stock_quantity
bahot tough ho gaya ye toh. iski mentos zindagi solution nahi hain kya?
Hello Ankit, I have purchased your SQL course and the money is also deducted but the course is not reflecting in my profile. Please help me with this,i have already mailed you with the transaction screenshot
Checking
***** My Solution in PosgreSQL without lag and lead ******
with A as (select
*,
extract(day from record_date) - row_number() over (partition by supplier_id,product_id order by record_date) as grp
from
stock
where
stock_quantity < 50)
select supplier_id,
product_id,
no_of_days,
first_date
from (
select
supplier_id,
product_id,
grp,
count(grp) as no_of_days,
min(record_date) as first_date
from
A
group by 1,2,3
having count(grp) >= 2) A
order by 1,2,4
I dint understand group_id.. can someone please explain. Remaining everything i understood
Group id is important because, in the data if you see, we have multiple dates for the same supplier_id, product_id where the output is coming. If we don't have an identifier to differentiate those same supplier_id, product_id combinations, we can't output the data in the way we want. So, in order to make it convenient for us, we are incorporating a temporary flag sort of thing (group id) that descrimantes each group and also finally helps us to get the data at a level where we can simply use minimum function to get the date for each group.
Did you understand?
@@SunilKumar_67 I understood that group id is important, but my question was how group id is being created here,, i dint understand that. I was looking for explanation on that.
@@bhartisingh3750 Acha got it. See, every time the difference in the dates is greater than 1 that means that it is the start of a new group in the same supplier and product id. If we create a new column which is basically a cumulative sum of this date diff flag, that cum_sum_flag will always be different for each group. So, since the cum_sum_flag is different for each group, we can use it to distinguish or use group by on that cum_sum_flag to get the number of days in that each group. If you still didn't understand, i suggest go step by step as explained by Ankit and see the changes in the output, you will understand it on your own
TBH very complicated solution...
with cte as(select *, dateadd(day, -1*row_number() over(partition by supplier_id, product_id order by record_date),record_date) grp
from stock
where stock_quantity1;
Kya sikha rha hai 🙄
MS SQL approach with recursive CTE
with rec as (
select record_date
from stock where record_date = '2022-01-01'
union all
select dateadd(day,1,record_date) record_date
from rec
where dateadd(day,1,record_date)
WITH InventoryCTE AS (
SELECT
supplier_id,
product_id,
record_date,
stock_quantity,
LAG(stock_quantity) OVER (PARTITION BY supplier_id, product_id ORDER BY record_date) AS prev_stock_quantity
FROM inventory
)
SELECT
supplier_id,
product_id,
MIN(record_date) AS starting_date
FROM InventoryCTE
WHERE stock_quantity < 50
AND prev_stock_quantity < 50
GROUP BY supplier_id, product_id
HAVING COUNT(*) >= 2;
He doesn't know how to explain in good way
Can we directly create a flag by using row_number ?
SELECT a.supplier_id, a.product_id, count(*), min(record_date) as [start_date]
FROM
(
SELECT supplier_id, product_id, record_date,
DateAdd(day, - ROW_NUMBER() Over(order by record_date), record_date) as flag_date
FROM
stock
WHERE
stock_quantity < 50
) a
GROUP BY a.flag_date, a.supplier_id, a.product_id
HAVING COUNT(*) >= 2
ORDER BY a.supplier_id, a.product_id
@Ankit : if you suggest any use case where above code will not work it will be more helpful to understand better way
with cte as (
SELECT *, EXTRACT('doy' FROM record_date) as doy,
ROW_NUMBER() OVER(PARTITION BY supplier_id, product_id ORDER BY record_date) as rn,
EXTRACT('doy' FROM record_date) - ROW_NUMBER() OVER(PARTITION BY supplier_id, product_id ORDER BY record_date) as group_cl FROM stock
WHERE stock_quantity= 2
ORDER BY supplier_id, product_id, first_date
Could you please explain the reason for making the column group_cl?
@@prakritigupta3477 to find the groups of consecutive days
my solution in MSSQL :
little more like of code :
with cte as
(
Select *
,row_number()over(partition by supplier_id,product_id order by record_date ) as rn
, cast (dateadd(day, row_number()over(partition by supplier_id,product_id order by record_date ) , '1900-01-01' ) as date) as new
from stock
where stock_quantity < 50
),cte2 as(
Select *,DATEDIFF(day,new,record_date) as new1 from cte
),cte3 as(
Select *
,FIRST_VALUE(record_date)over(partition by supplier_id,product_id,new1 order by supplier_id ) as new3
from cte2)
Select supplier_id,product_id,cnt, record_date from
(
Select *
,row_number()over(partition by supplier_id,new3 order by supplier_id) as rn1
,count(new3)over(partition by supplier_id,new3 order by supplier_id) as cnt from cte3
)
as a where rn1=1 and cnt > = 2
Easier Solution :
SELECT supplier_id, product_id, COUNT(*) AS total_consecutive_count, MIN(record_date) AS first_date
FROM(
SELECT supplier_id, product_id, stock_quantity, record_date,
ROW_NUMBER() OVER(PARTITION BY supplier_id, product_id ORDER BY record_date) AS rn,
DATEDIFF(DAY, ROW_NUMBER() OVER(PARTITION BY supplier_id, product_id ORDER BY record_date), DAY(record_date)) AS difference
FROM stock
WHERE stock_quantity < 50
) subquery
GROUP BY difference, supplier_id, product_id HAVING COUNT(difference) >= 2
SELECT
product_id,
supplier_id,
MIN(record_date) AS start_date
FROM
(
SELECT
product_id,
supplier_id,
record_date,
LEAD(stock_quantity) OVER (PARTITION BY product_id, supplier_id ORDER BY record_date) AS next_day_stock,
LAG(stock_quantity) OVER (PARTITION BY product_id, supplier_id ORDER BY record_date) AS prev_day_stock
FROM
supplier_inventory
) AS subquery
WHERE
stock_quantity < 50 AND
(next_day_stock < 50 OR prev_day_stock < 50)
GROUP BY
product_id,
supplier_id;
WITH cte AS (
SELECT
supplier_id,
product_id,
record_date,
DENSE_RANK() OVER (PARTITION BY supplier_id, product_id ORDER BY record_date) AS rn,
record_date - DENSE_RANK() OVER (PARTITION BY supplier_id, product_id ORDER BY record_date) AS date_diff
FROM stock
),
grouped AS (
SELECT
supplier_id,
product_id,
record_date,
min(record_date) over(PARTITION BY supplier_id,
product_id, date_diff) as start_Date,
COUNT(*) OVER (PARTITION BY supplier_id,
product_id, date_diff) AS number_of_days
FROM cte
)
SELECT
supplier_id,
product_id,
start_date AS first_date,
number_of_days
FROM grouped
WHERE number_of_days >= 2
GROUP BY supplier_id, product_id, start_date, number_of_days;
with cte as (
select * ,rank()over(partition by supplier_id,product_id order by record_date rows between unbounded preceding and current row) from stock where stock_quantity=2
order by supplier_id,product_id,cons_date)
with cte as
(select supplier_id, product_id, stock_quantity, record_date, (datepart(day, record_date)-rank() over(partition by supplier_id, product_id order by datepart(day, record_date))) as rnk from stock where stock_quantity < 50
)
select supplier_id,product_id,count(*) as no_of_records, min(record_date) as first_date from cte
group by supplier_id,product_id,rnk
having count(*) > 1