*HARD* Twitch SQL Interview Question for Data Scientists and Data Analysts (StrataScratch 2012)
HTML-код
- Опубликовано: 8 сен 2024
- Solution and walkthrough of a real SQL interview question for Data Scientist and Data Analyst technical coding interviews. This question was asked by Twitch and is called "Viewers Turned Streamers".
Try this question on StrataScratch: platform.strat...
Find me on LinkedIn: / frederikmueller
Playlists:
StrataScratch SQL Coding Questions: • StrataScratch Coding Q...
LeetCode database SQL problems: • LeetCode Database Prob...
LeetCode easy SQL problems: • LeetCode Easy Database...
LeetCode medium SQL problems: • LeetCode Medium Databa...
LeetCode hard SQL problems: • LeetCode Hard Database...
Try this question yourself: platform.stratascratch.com/coding/2012-viewers-turned-streamers?via=frederik
Solution:
SELECT twitch_sessions.user_id, COUNT(*) AS n_sessions
FROM twitch_sessions JOIN
(SELECT DISTINCT user_id, FIRST_VALUE(session_type) OVER (PARTITION BY user_id ORDER BY session_start ASC) AS first_session
FROM twitch_sessions) user_first_session
ON twitch_sessions.user_id = user_first_session.user_id
WHERE twitch_sessions.session_type = 'streamer'
AND first_session = 'viewer'
GROUP BY twitch_sessions.user_id
ORDER BY n_sessions DESC, user_id ASC
What if there is a viewer who remained as viewer on his second session?
In that case for that user id , the streamer session count will be zero.
Solution for above usecase:
SELECT * FROM
(
SELECT twitch_sessions.user_id, COUNT(*) AS n_sessions
FROM twitch_sessions JOIN
(SELECT DISTINCT user_id, FIRST_VALUE(session_type) OVER (PARTITION BY user_id ORDER BY session_start ASC) AS first_session
FROM twitch_sessions) user_first_session
ON twitch_sessions.user_id = user_first_session.user_id
WHERE twitch_sessions.session_type = 'streamer'
AND first_session = 'viewer'
GROUP BY twitch_sessions.user_id
)
WHERE n_sessions > 0
Here's my solution using cte
with cte1
as
(
-- find the first session for each user
select user_id, session_start, session_type,
first_value(session_type) over(partition by user_id order by session_start) as first_session
from twitch_sessions
)
--select * from cte1
select user_id,
count(case when session_type = 'streamer' then 1 else null end) as total_streams
from cte1
where first_session = 'viewer'
group by user_id
SUM CASE would work as well right?
@@glstnlev Yes. SUM(case when session_type = 'streamer' then 1 else 0 end) as total_streams
You've helped me a lot with your videos. Kudos
Bro another great video ! This is what I came up with.Took me like 40 minutes lol :
WITH a as (
SELECT user_id, session_type, session_start, RANK () OVER (PARTITION BY user_id ORDER BY session_start) AS rnk
FROM twitch_sessions ),
b AS (
SELECT *
FROM a
WHERE session_type = 'viewer' AND rnk = 1 )
SELECT a.user_id, COUNT(*) AS num_of_sessions
FROM a
JOIN b ON a.user_id = b.user_id
WHERE a.rnk != 1
GROUP BY 1
ORDER BY num_of_sessions DESC , user_id
Thanks Frederik. The problem looks simple enough. Maybe there are some other expectations based on which the rating was given.