SQL Interview Problem asked during Amazon Interview
HTML-код
- Опубликовано: 22 июн 2024
- In this video, let us solve an SQL Problem asked during the Amazon Interview.
OdinSchool: hubs.la/Q02CX94v0
Download the scripts used in the video:
techtfq.com/blog/sql-intervie...
Thanks for watching!
Check out the upcoming Data Science bootcamp on OdinSchool: hubs.la/Q02CX94v0
In this bootcamp you're gonna give training or it will be done by someone else?
My solution in PostgreSQL:
WITH CTE AS
(SELECT *
, dates - (ROW_NUMBER() OVER(PARTITION BY employee, status ORDER BY dates)::INT) AS grp
FROM emp_attendance)
SELECT employee, MIN(dates) AS from_date
, MAX(dates) AS end_date, status
FROM CTE
GROUP BY employee, grp, status
ORDER BY employee, from_date
In my ROW_NUMBER function, I have partitioned by employee and status, and ordered by dates. For each employee, the data is grouped by status and ordered by dates. The row number resets to 1 whenever the status changes (from present to absent or vice-versa) within each employee's partition.
I then subtracted the row number from the date to create a group identifier (grp) to identify consecutive dates within the same status for each employee.
Thanks for the problem and explaination!. This was my solve:
with mycte as
(
SELECT *,
rank() over(partition by employee, status order by dates) as rn,
datepart(day, dates) as theday,
(datepart(day, dates) -rank() over(partition by employee, status order by dates)) as diff
from emp_attendance
)
select employee, from_date, to_date, status
from
(
select employee, diff, status, min(dates) as from_date, max(dates) as to_date
from mycte
group by employee, diff, status
) as x
order by 1,2,3
Perfect Explanation, Thanks!
Man you are legend....great explanation 😮
CTEs and window functions are new to me in learning stage, but I got this very clearly thanks for the in detail explanation
This is very usefull information Bro!
solving challenging queries from top mnc with nice explanation, great thoufiq keep it up.
Hi thoufiq! Here is my simple solution using SQL server:
with cte1 as(select employee, dates, dateadd(day,-1*(row_number()over(partition by employee,status order by dates)),dates) as date_grp,status
from employee)
select employee,min(dates) as from_date,max(dates) as to_date, status
from cte1
group by employee,date_grp,status
order by employee,from_date;
can you explain below part 😅😅
dateadd(day,-1*(row_number()over(partition by employee,status order by dates)),dates) as date_grp
@@mahivamsi9598 -1*(row_number()over(partition by employee,status order by dates) this value will give positive value so he decided to multiply with -1 so that it gets negative value so that difference can be created
Thank you TFQ
I really appreciate .
Tqs For giving Valueble Infomation.
My solution:
WITH cte AS (
SELECT *,
CASE WHEN status = LAG(status, 1, status) OVER (PARTITION BY employee ORDER BY dates) THEN 0 ELSE 1 END AS flag
FROM emp_attendance
),
cte2 AS (
SELECT employee, dates, status, SUM(flag) OVER (PARTITION BY employee ORDER BY dates) AS flag_sum
FROM cte
)
SELECT employee, MIN(dates) AS from_date, MAX(dates) AS to_date, MAX(status) AS status
FROM cte2
GROUP BY employee, flag_sum
ORDER BY employee, from_date;
Sir, Is there will be any difference i use iif inplace of case Statment???
@TFQ can we use min and max instead of first_value and last_value in the window function?
this one is a very tough question, for what level role was this question asked 😰
very hard to think about this question and finish in 30 mins
with first as (
select *,lag(status,1) over(partition by employee order by dates) as prev_status
from emp_attendance
),
second as (
select b.* from (
select *,case when status = prev_status then 'SAME' else 'CHANGE' end as status_check from first
) b
where b.status_check='CHANGE'
),
final as (
select employee ,dates as from_date ,lead(dates,1) over(partition by employee order by dates)-1 as to_date,status
from second )
select employee,from_date,coalesce(to_date,from_date),status from final
order by employee,from_date;
As always 👍
How difficult sql queries are to write on real job senario? Intermediate or hard ?
Sir may we solve this problem using lag() window function?
have a question for you @techTFQ , how much time u have taken to come up for this solution ? just curious to know an approximate time
🤣🤣
Hi comments box here is my solution:
with cte as(
SELECT *,dense_rank()over( partition by employee order by employee,dates) as rn,
dense_rank() over(partition by employee,status order by employee,dates ) as rn2 from emp_attendance),
cte1 as(
select employee,dates,status,rn-rn2 as fn from cte
order by dates)
select distinct employee,first_value(dates) over(partition by employee,fn order by dates )as from_date,last_value(dates) over(partition by employee,fn) as to_date,status from cte1
order by employee,from_da
nice to see u again, bro the last Line of your given Data is a little 0 too much
insert into emp_attendance values('A2', '2024-01-010', 'ABSENT');
the source is alsmost the same
with cte as
(select *, row_number() over(partition by employee order by employee, dates) as rn
from emp_attendance),
cte_present as
(select *, row_number() over(partition by employee order by employee, dates) AS RN2
, rn - row_number() over(partition by employee order by employee, dates) as flag
from cte where status='PRESENT'),
cte_absent as
(select *, row_number() over(partition by employee order by employee, dates) as rn3
, rn - row_number() over(partition by employee order by employee, dates) as flag
from cte where status='ABSENT' )
select employee
, first_value(dates) over(partition by employee, flag order by employee, dates) as from_date
, last_value(dates) over(partition by employee, flag order by employee, dates
range between unbounded preceding and unbounded following) as to_date
, status
from cte_present
union
select employee
, first_value(dates) over(partition by employee, flag order by employee, dates) as from_date
, last_value(dates) over(partition by employee, flag order by employee, dates
range between unbounded preceding and unbounded following) as to_date
, status
from cte_absent
order by employee, from_date
with specification rn2, rn3 in MS SQL Server
Share some tips to get into a product based company
I struggled with this. The rn - rn where status = X is a cool pattern.
this is the corrected data set
(the data set in description not included A2)
drop table if exists emp_attendance;
create table emp_attendance
(
employee varchar(10),
dates date,
status varchar(20)
);
insert into emp_attendance values('A1', '2024-01-01', 'PRESENT');
insert into emp_attendance values('A1', '2024-01-02', 'PRESENT');
insert into emp_attendance values('A1', '2024-01-03', 'PRESENT');
insert into emp_attendance values('A1', '2024-01-04', 'ABSENT');
insert into emp_attendance values('A1', '2024-01-05', 'PRESENT');
insert into emp_attendance values('A1', '2024-01-06', 'PRESENT');
insert into emp_attendance values('A1', '2024-01-07', 'ABSENT');
insert into emp_attendance values('A1', '2024-01-08', 'ABSENT');
insert into emp_attendance values('A1', '2024-01-09', 'ABSENT');
insert into emp_attendance values('A1', '2024-01-10', 'PRESENT');
insert into emp_attendance values('A2', '2024-01-06', 'PRESENT');
insert into emp_attendance values('A2', '2024-01-07', 'PRESENT');
insert into emp_attendance values('A2', '2024-01-08', 'ABSENT');
insert into emp_attendance values('A2', '2024-01-09', 'PRESENT');
insert into emp_attendance values('A2', '2024-01-10', 'ABSENT');
SELECT * from emp_attendance;
Can you please start a Snowflake Bootcamp ? Will be really helpful.
If you have any time gap
Please make a video about
Frequently asking interview questions in sql for Capgemini interview...
Solution in SQL Server
with CTE as (
select employee, dates, status, ROW_NUMBER() over(partition by employee, status order by dates)
as rn from emp_attendance), CTE2 as (
select employee, dates, status, DATEDIFF(day, rn, dates) as rn2 from CTE)
select employee, min(dates) as mindate, max(dates) as maxdates, status
from CTE2 group by employee, status, rn2 order by employee, mindate
MySql solution: with cte as (
select *, row_number() over (partition by employee, status order by dates ) as rw,
dates - row_number() over (partition by employee order by employee) as diff from emp_attendance
order by employee, dates
)
select employee, min(dates) as from_date, max(dates) as to_date, status from cte
group by employee, status, diff
Sir plz do one vd for jr data analyst interview questions and ans like pdf
You can achieve the same differentiation between employees based on status by simply using rank() and partitioning it by employee, status (as in the first cte).
WITH rank_cte AS (
SELECT
*,
rank() OVER(partition by employee, status order by dates) as r
FROM emp_attendance
ORDER BY employee, dates
),
consec_cte AS (
SELECT
*,
r - row_number() OVER() AS consec
FROM rank_cte
)
SELECT
employee,
MIN(dates) AS start_date,
MAX(dates) AS end_date,
status
FROM consec_cte
GROUP BY employee, status, consec
ORDER BY employee, start_date;
select max(amount) as thirdhighamount from orders where amount
Hello my solution in Sql Server:
WITH FLO AS (
SELECT *, CASE WHEN STATUS LAG(STATUS,1,'OPOUI')OVER(PARTITION BY EMPLOYEE ORDER BY DATES)THEN 1 ELSE 0 END
AS FLAG
FROM EMP_ATTENDANCE
), FLO1 AS (
SELECT * , SUM(FLAG)OVER(PARTITION BY EMPLOYEE ORDER BY DATES)AS GRP
FROM FLO
)
SELECT EMPLOYEE, MIN(DATES)AS FROM_DATE, MAX(DATES)AS TO_DATE, STATUS
FROM FLO1
GROUP BY EMPLOYEE, STATUS,GRP
ORDER BY EMPLOYEE, FROM_DATE
Hope it helps.
Where we can find the dataset?
In the description box click on script link and download that script you’ll get all queries
Here's my take on it via MS SQL server for given dataset
=================================================
with cte as (select *,
day(dates) - row_number() over (partition by status, employee order by dates) grp
from emp_attendance)
select employee, MIN(dates) as from_date, MAX(dates) to_date, status
from cte
group by grp, employee, status
order by employee, from_date
=================================================
I have tried to solve this in MYSQL.
with cte as (
select *,row_number() over(partition by employee order by dates) as rn,
row_number() over(partition by employee,status order by dates) as rn1
from emp_attendance
)
select employee,min(dates) as from_date,max(dates) as to_date,status
from cte
group by employee,rn-rn1,status
order by 1,2
nice
Hi Taufiq ,Please confirm my solution is how optimal?
with cte as(
SELECT *,lead(status,1,null) over(partition by employee order by dates) as next_day,min(dates) over(partition by employee) as start_day FROM emp_attendance)
select employee,date_add(LAG(DATES,1,DATE_SUB(START_DAY,1)) OVER(PARTITION BY EMPLOYEE order by dates),1) AS FROM_DATE, dates as TO_DATE,status from cte where status!=next_day or next_day is null;
with cte as(SELECT *,
lag(status,1,status)over(partition by employee order by dates) as nxt
from emp_attendance),v1 as(
select *,
sum(case when status = nxt then 0 else 1 end)over(partition by employee order by dates) as grp
from cte)
select employee,min(dates) as from_date,
max(dates) as to_date,status
from v1
group by employee, grp,status;
Sir but there should be a query related to MSSQL,because there are people who are using MSSQL only.Need a Practice session on MSSQL
Copy this query and paste that in chat get type like alter this code to work in mssql
MYSQL Solution
Select employee,Min(Dates) as From_date,Max(Dates) as End_Date,Status from
(Select *,subdate(Dates,interval Row_Number() over
(Partition by Employee,Status Order by dates) Day) as Seg from
Emp_Attendance)N group by employee,Seg order by Employee, Dates;
with cte as (select *,
Date - INTERVAL '1' DAY * (row_number() over(partition by Employee, Status order by Date asc)) as rnk
from EMP_ATD)
select
Employee,
min(Date),
max(Date),
Status
from cte
group by Employee, rnk, Status
order by Employee, min(date);
Someone can pls solve this infosys interview question,
Text1 3
Text2 5
Text3 4
Output should be
Text1
Text1
Text1
Text2
Text2
Text2
Text2
Text2
Text3
Text3
Text3
Text3
Query should be single line query.
Bro Odin school is not a good option, i wasted my time and money , They wont provide you placements , I joined in 2022 , still i am not get a job through it, pls dont waste ur time and money
with cte as(
SELECT
*,
ROW_NUMBER()OVER(PARTITION BY employee, status ORDER BY dates, status) - ROW_NUMBER()OVER(PARTITION BY employee ORDER BY dates, status) as rnk1
FROM attendance
ORDER BY 1,2
)
SELECT
employee,
min(dates) as from_date,
max(dates) as to_date,
status
FROM cte
GROUP BY employee,status ,rnk1
ORDER BY 1, 2
Isn't this approach more straight forward?
WITH grouped_attendance AS (
SELECT
employee,
dates,
status,
DATE_SUB(dates, INTERVAL ROW_NUMBER() OVER (PARTITION BY employee, status ORDER BY dates) DAY) AS group_date
FROM emp_attendance
)
SELECT
employee,
MIN(dates) AS from_date,
MAX(dates) AS to_date,
status
FROM grouped_attendance
GROUP BY employee, status, group_date
ORDER BY employee, from_date;
I guess you are always overcomplicating things don't know why!
MS SQL approach
with a as (
SELECT *, ROW_NUMBER() over(partition by employee order by dates) rn
from emp_attendance)
,b as (
select *,rn - ROW_NUMBER() over(partition by employee order by dates) rn2
from a
where status like 'PRESENT')
,c as (
select *,rn - ROW_NUMBER() over(partition by employee order by dates) rn2
from a
where status not like 'PRESENT')
select employee, status, min(dates) from_date, max(dates) to_date
from b
group by rn2, employee, status
union
select employee, status, min(dates) from_date, max(dates) to_date
from c
group by rn2, employee, status
order by 1, 3
with cte as(
select employee,dates,status,DAY(dates)-ROW_NUMBER()OVER(PARTITION BY employee order by dates)rn1
from emp_attendance
where status = 'PRESENT'),
cte2 as(
select employee,dates,status,DAY(dates)- ROW_NUMBER()over(partition by employee order by dates)rn2
from emp_attendance
where status = 'ABSENT')
select employee,MIN(dates)as FROM_DATE,MAX(dates)TO_DATE,MAX(status)as status from cte
group by employee, rn1
UNION ALL
select employee,MIN(dates),MAX(dates),MAX(status) from cte2
group by employee,rn2
ORDER BY employee,FROM_DATE,TO_DATE
Solution Given by claude 3.5 Sonnet:
WITH grouped_attendance AS (
SELECT
*,
DATE_SUB(date, INTERVAL ROW_NUMBER() OVER (PARTITION BY employee, status ORDER BY date) DAY) AS group_date
FROM employee_attendance
)
SELECT
employee,
MIN(date) AS FROM_DATE,
MAX(date) AS TO_DATE,
status
FROM grouped_attendance
GROUP BY employee, status, group_date
ORDER BY employee, FROM_DATE;
create table emp_attendance(employee varchar(200), Dates date, status varchar(200));
Insert into emp_attendance values
('A1','2024-01-01','PRESENT'),
('A1','2024-01-02','PRESENT'),
('A1','2024-01-03','PRESENT'),
('A1','2024-01-04','ABSENT'),
('A1','2024-01-05','PRESENT'),
('A1','2024-01-06','PRESENT'),
('A1','2024-01-07','ABSENT'),
('A1','2024-01-08','ABSENT'),
('A1','2024-01-09','ABSENT'),
('A1','2024-01-10','PRESENT'),
('A2','2024-01-06','PRESENT'),
('A2','2024-01-07','PRESENT'),
('A2','2024-01-08','ABSENT'),
('A2','2024-01-09','PRESENT'),
('A2','2024-01-10','ABSENT');
select * from emp_attendance;
WITH cte AS(
SELECT
EMPLOYEE
,DATES
,STATUS
,rownum - SUM(CASE WHEN STATUS = 'PRESENT' THEN 1 ELSE 1 END) OVER(PARTITION BY EMPLOYEE, STATUS ORDER BY DATES) AS test
from emp_attendance
--where EMPLOYEE = 'A1'
ORDER BY EMPLOYEE, DATES
),SUMMARY AS(
SELECT
EMPLOYEE
,status
,test
,MIN(DATES) AS FROM_DATE
,MAX(DATES) AS TO_DATE
FROM cte
GROUP BY EMPLOYEE ,status,test
ORDER BY FROM_DATE
)
SELECT
EMPLOYEE
,FROM_DATE
,TO_DATE
,status
FROM summary
ORDER BY EMPLOYEE ,FROM_DATE;