Magic. 1080 * 1920 * 4 bytes per pixel = 8,294,400 bytes (/ 1024^2 = 7.9 MiB) per frame. 10s @ 24 fps = 240 frames, for a total of about 1,898 MiB of uncompressed video. With compression, you can achieve somewhere around a 95% reduction in space, so you're looking at ballpark 100 MiB per compressed 10s video clip.
What a mess! You can offload all the video processing and manifest generation to the phone, then send that to a queue that uploads the video in chunks. Who's going to wait for one big chunk of video to upload? The queue keeps track of the uploads and then the DB insertions. On the feed back end, you only need to return video IDs and then have them fetched from the closest CDN.
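For anyone who wants to redo that arithmetic, here is a minimal sketch in Python. The resolution, 4 bytes per pixel, 24 fps, 10 s and the 95% reduction are the figures from the comment above; the variable names are made up for illustration, and the compression ratio is only a ballpark.

```python
# Back-of-the-envelope size of a 10 s uncompressed 1080p clip.
# All inputs are the commenter's assumptions, not measured values.
WIDTH, HEIGHT = 1080, 1920        # pixels
BYTES_PER_PIXEL = 4               # e.g. RGBA, 8 bits per channel
FPS = 24
DURATION_S = 10
COMPRESSION_REDUCTION = 0.95      # assumed ballpark space saving

MIB = 1024 ** 2

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
frame_count = FPS * DURATION_S
raw_mib = frame_bytes * frame_count / MIB
compressed_mib = raw_mib * (1 - COMPRESSION_REDUCTION)

print(f"per frame:  {frame_bytes / MIB:.2f} MiB")
print(f"raw clip:   {raw_mib:.1f} MiB")
print(f"compressed: {compressed_mib:.1f} MiB")
```

Rounding the per-frame size to 7.9 MiB before multiplying gives 1896 MiB; carrying the full value gives about 1898 MiB. Either way the compressed clip lands near 100 MiB.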
I'm having trouble seeing the difference in data size between video metadata and users. One billion users, each with 200 friends, that's 200 billion rows of data. Is that not similar in size to 10 Billion per year video metadata rows?
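One way to look at this question is to separate row counts from bytes. The sketch below uses the row counts from the comment (1 billion users, 200 friends each, 10 billion video rows per year); the per-row sizes are purely hypothetical placeholders, so treat the byte totals as an illustration of the method and not as real numbers.

```python
# Row counts come from the comment; row sizes are HYPOTHETICAL.
USERS = 1_000_000_000
FRIENDS_PER_USER = 200
VIDEO_ROWS_PER_YEAR = 10_000_000_000

FRIENDSHIP_ROW_BYTES = 16       # assumption: two 8-byte user IDs
VIDEO_METADATA_ROW_BYTES = 1_000  # assumption: title, URLs, counters, etc.

user_rows = USERS                               # vertices of the graph
friendship_rows = USERS * FRIENDS_PER_USER      # edges, one row per direction

friendship_bytes = friendship_rows * FRIENDSHIP_ROW_BYTES
video_bytes_per_year = VIDEO_ROWS_PER_YEAR * VIDEO_METADATA_ROW_BYTES

print(f"user rows:        {user_rows:,}")
print(f"friendship rows:  {friendship_rows:,}")
print(f"video rows/year:  {VIDEO_ROWS_PER_YEAR:,}")
print(f"friendship bytes: {friendship_bytes / 1e12:.1f} TB")
print(f"video bytes/year: {video_bytes_per_year / 1e12:.1f} TB")
```

By row count the friendship edges (200 billion) are 20x the yearly video rows, so the question is fair. Which table is larger in bytes depends entirely on the row sizes you assume.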
There is an error with the writing at @18:05. I believe the viewed videos should be 1,000,000,000 (one billion) / 100,000 not 1,000,000 (one million) / 100,000 (it's missing 3 zeros). The actual answer is correct though
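A quick check of the corrected division, assuming the 1,000,000,000 is views per day and the 100,000 is the usual rounded-up number of seconds in a day (both readings are assumptions about what was on the whiteboard):

```python
# Views per second from views per day, rounded vs. exact seconds per day.
VIEWS_PER_DAY = 1_000_000_000        # assumed meaning of the numerator
SECONDS_PER_DAY_ROUNDED = 100_000    # common interview shortcut
SECONDS_PER_DAY_EXACT = 24 * 60 * 60

rounded_rate = VIEWS_PER_DAY // SECONDS_PER_DAY_ROUNDED
exact_rate = VIEWS_PER_DAY / SECONDS_PER_DAY_EXACT

print(f"rounded: {rounded_rate:,} views/s")
print(f"exact:   {exact_rate:,.0f} views/s")
```

With a numerator of one million the result would be 10 views per second, so one billion is the only reading that gives the 10K/s figure.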
49:14 Do metadata definitions matter if your database is NoSQL, aka schemaless? I feel like it is kinda weird to have the interviewer sitting there watching you emphasise these kinds of things.
Can you please explain, I can't get it: how can I get good write-heavy scale with a relational database if I can't shard it? For example, for bank applications, where I can't use NoSQL and I need strong consistency.
I think if the interviewer at some moment answers that it doesn't make sense, I will get a heart attack =)) But seriously, thank you for the videos! I have a system design interview in 3 hours and I am very nervous.
@@IGotAnOffer-Engineering ow, it was hard, but I think I made it, they asked me to design a task tracker with very high load on read and write. they rated me as a junior+ within the senior graduation, thank you for asking!))
I think it's totally different in reality LOL, all those system designs are vacuum based assumptions without real use case of production infra that will be overcomplicated. No one builds at this scale from the start, you always will face legacy sh*t first then iterate over it.
Great video. Some suggestions for improvement: we could use NoSQL for the user data and follow a graph schema for the followers and followees. Use Redis to cache the user data and video metadata if necessary. We could use SQL for video metadata since there's no join operation and introduce sharding, but NoSQL works great too.
Hi, your videos are great. I'm not looking to pass any interview, just to better understand the topic, as I have more of a data science and math background. Can you suggest a good book on system design?
Honestly this guy yaps and rants way too much and the design seems too high level. this might be great for an EM, but I can’t imagine a Senior+ IC not going into detail like this guy and still passing their Google interview.
Hi, I see there is a lot of basic calculation in the interview. Do you have a table somewhere with these size assumptions for text/images/video/music?
Why not a Graph DB to store the User Data instead of using Relational Database? Graph DB can be queried quickly instead of complex sql queries with RDBMS
Yours is a perfectly valid proposal. Although, I don't know if I'd categorize any query for a tik-tok application to be "complex" in terms of SQL. The underlying data itself isn't extremely complicated.
Why return the URL list to the TIKTOK APP in the above diagram? Why can't we get the URLs via the other app (on the right of the LBs), the TIKTOK APP SERVICE? Can the TIKTOK APP SERVICE fetch the original video from BLOB or CACHE (DNS) using the VIDEO_ID? Why return URLs again?
Superb communication skills from Mark. Not that he is a fast or deep thinker, but he clearly talks about where he is and steadily gets to the destination. Like, "let me just make a quick detour here", "Let me put a placeholder here. I will get back to this" x 2. This makes him a good person to talk to. This is especially important for EMs as they work daily with non-tech folks.
Some nitpicks, mostly from the tech side:
1. Spent a little bit too much time (15 min) on estimations.
2. Started talking about details like the cache without giving a big picture yet.
3. The amount of friendship data is dominated by the number of edges of the graph, not the number of vertices.
4. RDS horizontal scaling is only read-scaling by adding read replicas. It cannot scale with an increasing number of friendships.
5. Video history definitely does not belong in the user metadata table. Same for uploaded videos, screen time...
6. Doing local computation drains the phone battery fast and is definitely not a good solution. The interviewer was being nice and said this is "interesting".
OK, this manager has a very neat way of simplifying design. Good to have more videos from him... (YouTube/Netflix)
TBH this isn't really a good thing. I wouldn't be confident I could pass at the mid level following the same interview
Mark: does that make sense ?
Interviewer: yeah that makes sense 🤔
yes it would be definitely more interesting if the interviewer was an actual software architect / engineer
11❤😅@@engineerprototype9191
Sites like YouTube and TikTok deliver videos adaptively using the DASH protocol to stream 2-second video segments over HTTP. This allows the delivery to work immediately over phones, without a long lag to build up a buffer, and the quality adapts if the channel capacity goes down. Typically the video will have 8-9 bit rates, anywhere from 128 Kbps to 2.5 Mbps for 1080p. You can see this on YouTube if you turn on "stats for nerds". Each file is sqrt(2) times larger than the last. So you might have 2.5 MB, 1.77MB, 1.25MB, 876KB, 619KB, 437KB, 309KB, 218KB, 150KB for 9 different bit rates. That sums up to about 8MB total for each video.
CloudFront to mobile drops latency by nearly an order of magnitude compared to direct S3 retrieval. The exact gain depends on your location and the S3 server location, but it will always be much faster.
One thing to bear in mind is that CloudFront has a built-in expiry-timer (TTL) cache. When you are manipulating S3 objects in real time, the cache has to be configured to drop the previously cached object for a key in favor of, say, an object uploaded 30 seconds ago. Otherwise the CDN URL keeps serving the older copy while a direct S3 retrieval by the same key returns the new one. There is some cost there, but it is true that the CDN stores data close to local nodes, and the benefits are awesome from an iOS developer's perspective.
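The "about 8MB" total can be checked with a short sketch. It assumes an idealized ladder where each rung is exactly sqrt(2) times smaller than the one above, starting at 2.5 MB, with 9 rungs. The function name and parameters are made up for illustration, and real encoders will not hit these sizes exactly.

```python
import math

def bitrate_ladder(top_mb=2.5, rungs=9, ratio=math.sqrt(2)):
    """Return per-rendition sizes in MB, largest first (idealized ladder)."""
    return [top_mb / ratio ** i for i in range(rungs)]

ladder = bitrate_ladder()
total_mb = sum(ladder)

for size in ladder:
    print(f"{size:.3f} MB")
print(f"total: {total_mb:.2f} MB")
```

Because the sizes form a geometric series, the total stays below top_mb / (1 - 1/sqrt(2)), roughly 3.4 times the largest rendition, no matter how many lower rungs are added.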
I always wonder why in every system design interview people do back-of-the-envelope calculations if none of those calculations are really used further on during the interview. They don't even impact high-level designs in any way, because most designs end up resilient, scalable, highly available, etc.
I think it's because it helps the candidate get a sense of the scale of the system, and then choose technologies that could work well at that scale.
@@jadeedstoresupport8916 But, yeah, as I said in my previous message: the assumption that the system should be large, scalable, resilient, consistent, etc., is always there. Because, basically, this IS the interest of the interviewer: to see how you can manage designing large systems, not small systems. That's why you always choose technologies that satisfy all those assumptions.
Kind of agree, kind of disagree. Looking at the calculations helps you focus on specific parts. For example, if the calculations give you lots of media, you might want to focus on the storage/cache strategies for media.
It can help you find bottlenecks and decide on a database type,
and it can also help you separate services based on usage.
For example, you can decide to use CQRS if the read-to-write ratio is big.
To add to what others have mentioned, it's good to cover these areas to give the interviewer visibility of your thought process and to let them know you are thinking about these types of things.
Great video! I think it would also be helpful to have a look at how Mark would design some real-time system (e.g. Online Auction). The focus imo should be on the immediate update and how this system would differ from regular auction (ebay)
Thank you. These are some interesting interview questions maybe you can consider as topics of interest - Design a metrics/monitoring system; Design Slack; Design logging system; Design a distributed Layer-7 api gateway ratelimiter . Thank you 🙏
thanks for the suggestions, noted :)
Thank you! I really enjoyed your video. The best part is the way he thinks about the system. His way is very systematic, he thinks ahead a lot of the aspects of the system. Which he later applied in his process, I am really grateful that I had the opportunity to see him think.
Glad it was helpful!
The design looks plausible, but it has a couple of major flaws:
1. Dynamo partition and sort key. Each additional key will increase the cost (N+N).
2. The user schema will likely not work for RDS if we are keeping track of watch history and user activity there. 10K mutations/s is simply not scalable on RDS.
3. The interviewer hardly talked about how to fetch the videos in order, which was one of the critical aspects.
Way too much time spent on calculation. In a real interview it would be a definite no-go. System design interviews usually last about 45ish minutes...
You can spot a cloud developer from the crowd by how much concern they have on cost optimization. Considering the scale of this project, more than a billion users, he spent an appropriate amount of time on it. Notice he relied on regions for the load balancer and moved on. You get that for free from cloud providers with very little config.
@@dontdoit6986 If you're not hired the only thing you'll be optimizing is your groceries cost. Konrad is right, in an actual interview, the candidate won't be left with enough time to actually go into the design in detail. Every minute counts.
Well said, Elon.
@@abhijit-sarkar agreed but again this is not a real interview, we don't have to copy him ditto, collect the key points and move on.
@@NishaUchil These interviews are supposed to teach people how to perform better in SD interviews. It’s important to be as realistic as possible. If you’ve got the “key points”, good for you, move on.
Get affordable, 1-to-1 expert coaching to ace your system design interview: igotanoffer.com/en/interview-coaching/type/system-design-interview?RUclips&
Been watching the videos on your channel and I think they're really good. However, it seems like they all represent "happy path" interviews, i.e., it seems like the "interviewer" is saying a lot of, "Yeah, that sounds right. That's good". I would love to see some examples of a "typical" interview and a "bad" interview where the interviewer does a lot more "work", so to speak.
Good point. The challenge is that I'm interviewing people with more expertise than me, so I find that tricky!
@@IGotAnOffer-Engineering a suggestion here is to take someone like Mark or other folks, and have them interview each other and then post it. otherwise, this is dangerous as you might not have the skills necessary to call out certain bad designs. Lots of beginners listen to this kind of thing
I thoroughly enjoy these videos.
However, I'd love to see the interviewer drill down on some of the designs, as 99% of the time the interview here is driven by the interviewee. Sometimes the solution appears too high level, not staff+ level design.
The ML part is questionable and the interview overall is too long; it's usually 45 min.
If I see an answer like this video, the guy will definitely not pass the interview
lol what was wrong with it
Excellent! You can design this same system in GCP and Azure with very little modification.
Another cool thing S3 can do is to generate signed upload urls that can be used to POST the videos directly to S3 instead of going through an upload service.
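To make the idea concrete, here is a toy sketch of what a signed, expiring upload URL looks like in principle. This is NOT S3's actual signing scheme (AWS uses Signature Version 4, and in practice you would let the AWS SDK generate the presigned URL). The host name, secret, and function names below are all hypothetical, and the only point is to show why the app server never has to proxy the video bytes.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"server-side-secret"               # hypothetical signing key
UPLOAD_HOST = "https://uploads.example.com"  # hypothetical storage endpoint

def _signature(object_key: str, expires_at: int, secret: bytes) -> str:
    payload = f"{object_key}\n{expires_at}".encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()

def sign_upload_url(object_key: str, expires_at: int, secret: bytes = SECRET) -> str:
    """App server: hand the client a URL it can upload to until expires_at."""
    query = urlencode({"expires": expires_at,
                       "signature": _signature(object_key, expires_at, secret)})
    return f"{UPLOAD_HOST}/{object_key}?{query}"

def verify_upload_url(url: str, now: int, secret: bytes = SECRET) -> bool:
    """Storage side: accept the upload only if the signature matches and is not expired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False
    expected = _signature(parsed.path.lstrip("/"), expires_at, secret)
    return hmac.compare_digest(signature, expected)

url = sign_upload_url("videos/abc.mp4", expires_at=1_700_000_600)
print(url)
print(verify_upload_url(url, now=1_700_000_000))  # within the validity window
print(verify_upload_url(url, now=1_700_001_000))  # after expiry
```

The client uploads straight to storage using the URL, so the upload service only has to issue short-lived URLs and record metadata.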
Yeah, but I think with the current pipelines we have, uploading videos is not a big deal, and it's good to have someone else with the same thinking. The actual problem, I feel, is the feed generator.
We are doing the first-level compression at the client end and then uploading videos using signed URLs.
Then the uploaded video needs to be passed through a transcoder pipeline to generate videos of different qualities, so that we can handle adaptive bitrate in the application based on the network, to save bandwidth and get instant playback.
One thing, in my opinion a rather big thing, is adjusting the design for legal reasons.
E.g., videos uploaded by people in some country may only be stored (blob/CDN) in certain localities (location of the actual server).
Interesting stuff and the technical choices were on point! One thing I would improve though is defining and slicing the requirements into bounded-context related services. For example, we could have defined the contexts: User, Video and Suggestor. Each would be represented by its own building block initially, allowing us to scale or break one up more if needed. The Suggestor would then have relations to User and Video and would (based on ML, for example) generate video suggestions for different parts of the app. Video would be responsible for CRUD on content and handle metadata and blobs. The User context would utilise this one when loading videos based on the Suggestor result, for example, or for uploads etc. The User context would handle user metadata, relations like friendship, and possibly views. For each we can find specific solutions for their storage, scaling, concurrency etc. needs.
Mark doing a perfect interview. Tom: "nice attempt" :))
It seems good for an EM interview, but for a senior SWE interview I think there were opportunities of better drill downs which the interviewer missed, maybe things like, ok you just added time-in-app, how do you measure and track it?
this is soo good 👏 can you please do something about banking/fintech ??
like that idea, we'll try and do one in the next couple of months
@@IGotAnOffer-Engineering hurry up
@@liftingisfun2350 ruclips.net/video/Zvr-ffhvw0Y/видео.html
wow, Mark is such a great engineer !
Great, I wish I could talk like Mark. He is my inspiration and man crush. Thanks Mark.
It is useful, however, I was expecting more in depth details
I think using a drawing package that limited his space hindered him. He kept having to say "sorry about the lack of space". Just use a tool that gives you that space.
This is great - it would be great if you can also have some Principal Mobile Engineers come on your channel and do a Mobile App Design/Architecture interview.
You can only afford to spend 20 minutes on back-of-the-envelope calculations when you're on YouTube. In an actual interview, you've blown away half of your time. This is why system design interviews are ridiculous: TikTok wasn't designed in half an hour, and real engineers actually have to think things through.
That's why design interviews have a narrow scope. It is not expected to design the real thing.
@@velvetunder3476 On the contrary, the scope is kept intentionally vague, and is not narrow. The candidate is expected to establish the scope but often times the interviewer comes with preconceived questions and areas they want to focus on based on their knowledge and experience.
Again, system design interviews are pure unadulterated bs.
@abhijit-sarkar I wouldn't say they are bs in their entirety. I honestly think that this is also a test of a candidate's ability to scope requirements effectively, which comes in handy in most innovative companies, where the cycle of conception, design, development, and deployment needs to happen pretty quickly. A candidate's ability to take a requirement and scope it down to the most important feature of the business case is gold. But, like you mentioned, most interviewers come with a preconceived notion that a candidate needs to keep things within that conceived idea, and anything outside that, no matter how good, is a fail.
When you asked for the number of users, do you need to wrap it back with different designs for 1000 users and 1 billion users, or just always regurgitate the generic infinite-scaling distributed system response?
One big thing I always wonder about is the tempo: should I keep it nice, slow and steady to give myself time to think, avoid long pauses and keep it all smooth, or go for a faster tempo?
Is there a tempo preference, or should I just follow Mark?
Out of all the system design videos I like Mark the most. He seems the most efficient with his tools and at demonstrating what's in his head so that it aligns with the interviewer's.
Very good interview! Thanks for this content.
I didn't get the part when he estimates the upload traffic as 1000 videos/sec * 1MB (10 Mbps). The upload traffic, in my opinion, is just 1GB/s and not 10GB/s. Where are those 10Mbps coming from?
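The unit conversion behind this question can be sketched as follows. The inputs (1000 uploads per second, 1 MB per video) are the ones quoted in the comment. The idea that a "10" figure could come from rounding 8 gigabits up to 10 is only a guess at what was meant in the video, not something the source confirms.

```python
# Upload bandwidth from upload rate and video size, in bytes and in bits.
UPLOADS_PER_SECOND = 1_000
VIDEO_SIZE_BYTES = 1_000_000   # 1 MB, decimal units

bytes_per_second = UPLOADS_PER_SECOND * VIDEO_SIZE_BYTES
gigabytes_per_second = bytes_per_second / 1e9
gigabits_per_second = bytes_per_second * 8 / 1e9

print(f"{gigabytes_per_second:.0f} GB/s")   # bytes
print(f"{gigabits_per_second:.0f} Gbit/s")  # bits: 8x the byte figure
```

So 1 GB/s is the right figure in bytes, and the same traffic is 8 Gbit/s, which people often round to about 10 Gbps in this kind of estimate.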
great work. Very clear communication, and exposed thinking process and tradeoff. Plus one for the choice of DDB, S3 and CF. We are going full AWS suite LOL
This is really good stuff. Looking for something on a banking, finance or insurance kind of app.
Interesting that he was an Engineering manager at Google, but he decided to do the hypothetical design in AWS infrastructure terms. I don't think the specific cloud infrastructure was mentioned as part of the design question (Spotify architecture). I would have thought a former Google guy would lay out the architecture in GCP terminology, but he went straight for AWS in his thought process. Good discussion though, I enjoyed it and gave me stuff to think about.
The ForYou video is also called "recommendation" :)
interesting. with video-apps the egress and disk-usage are very important. with tiktok the localness of content and shortliveness of it play a key role. I would imagine you would like to use every datacenter globally you can have for video-storage for optimal load times (and to be a responsible internet-user). user-database in one well accessible datacenter sounds right. like the applicant said. surely after that there should be some clever algorithms that spread hot-videos when they catch a lot of attention and I like the pepper idea but maybe that's almost beyond system-design
Was this an interview or just a presentation? Where are the counter questions? I myself could think of at least 20 questions off the top of my head, and this interviewer agrees with everything.
The interviewer is weak sauce.
Very useful video. Thanks so much. Also, could you do a session on a website like Medium blogs, with React on the frontend, Node.js on the backend and MongoDB as the DB, covering how to scale the backend and DB? How should one think about it from an HLD and LLD perspective, and about scaling?
I really enjoy watching your explanation, very inspiring
Hi, what happened to your videos with honglu? I really enjoyed both of them. please reupload or enlist if you can. thank youu
For upload and download, you probably should worry about latency, size of the video etc, instead of overall app related metrics?
Great interview, except that he left out the single most defining aspect of TikTok: its algorithm. Without that, it's a simple key-value video store. He also erred grossly in suggesting to run ML on users' phones (LOL?) and in defining how the ForYou feed would be created in general. Hint: it's more about user relations (likes and followings in common) than any ML at all. ML would not be able to actually select which videos to distribute to whom, since there's no way it could ingest all recent videos when each user requests a feed.
Can we get some system design questions like stock management?
Store video as a blob? What for? Store videos as objects and keep links to them in the DB, because IO on such blobs is much more expensive. Any ideas?
UPD: later in the video he said he picked AWS S3 as blob storage, so all fine. I didn't know it's "blob" storage; I was thinking of the data type (e.g. MySQL BLOB).
Amazing, thank you. Can you make one about a popular antivirus system?
He can calculate numbers very fast!
Actually on the calculations that was the one and only time we did an edit, because the calculations were taking a while and I was worried people would get bored (see comment above!). So I asked him to re-take it and do them quicker.
Would prefer to see it all in real time like a real google interview. I can always fast forward.
How do you come from 1080*1920 pixels for a 10-second video to roughly 1MB?
Magic. 1080 * 1920 pixels * 4 bytes per pixel = 8,294,400 bytes, or 8,294,400 / 1024^2 ≈ 7.9 MiB per frame. 10 s @ 24 fps = 240 frames, for a total of about 1,898 MiB of uncompressed video. With compression you can achieve somewhere around a 95% reduction in space, so you're looking at ballpark 100 MiB per compressed 10 s video clip.
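The arithmetic in that reply can be re-run as a small Python sketch. The 4 bytes per pixel, 24 fps and 95% compression saving are the reply's own assumptions rather than measured values, and the function name is only for illustration.

```python
# Rough size of a 10-second 1080x1920 clip, following the reply's assumptions.
WIDTH, HEIGHT = 1080, 1920
BYTES_PER_PIXEL = 4          # assumption from the reply (e.g. RGBA)
FPS = 24                     # assumption from the reply
DURATION_S = 10
COMPRESSION_SAVING = 0.95    # assumption from the reply: ~95% smaller
MIB = 1024 ** 2


def clip_sizes_mib(width, height, bytes_per_pixel, fps, duration_s, saving):
    """Return (frame, raw clip, compressed clip) sizes in MiB."""
    frame_bytes = width * height * bytes_per_pixel
    raw_bytes = frame_bytes * fps * duration_s
    compressed_bytes = raw_bytes * (1 - saving)
    return frame_bytes / MIB, raw_bytes / MIB, compressed_bytes / MIB


frame, raw, compressed = clip_sizes_mib(
    WIDTH, HEIGHT, BYTES_PER_PIXEL, FPS, DURATION_S, COMPRESSION_SAVING
)
print(f"frame: {frame:.2f} MiB, raw: {raw:.1f} MiB, compressed: {compressed:.1f} MiB")
```

Under these assumptions the compressed clip lands a little under 100 MiB, so reaching roughly 1 MB would need a far more aggressive compression ratio or a lower resolution than this calculation assumes.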
Any reason for not using Cloud Pub/Sub for real time, API Gateway, IAM?
how important are the data calculations for this kind of interview?
Thanks for sharing.
thanks Mark!
What a mess! You can offload all the video processing and manifest generation to the phone, then send that to a queue that uploads the chunks of video. Who's going to wait for one big chunk of video to upload? The queue keeps track of the uploads and then the DB insertions. On the feed backend, you only need to return video IDs and then have them fetched from the closest CDN.
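A toy sketch of the chunk-and-queue flow this comment proposes might look like the following. Everything here is hypothetical: the chunk size, the manifest fields, the in-memory `deque` standing in for a real queue and the dict standing in for blob storage are illustration only, not anything shown in the video.

```python
import hashlib
from collections import deque

CHUNK_SIZE = 4  # bytes; tiny on purpose so the demo is easy to follow


def make_chunks(video, chunk_size):
    """Split the raw video bytes into fixed-size chunks (last one may be shorter)."""
    return [video[i:i + chunk_size] for i in range(0, len(video), chunk_size)]


def build_manifest(video_id, chunks):
    """Describe the chunks so the backend can verify and reassemble them."""
    return {
        "video_id": video_id,
        "chunks": [
            {"index": i, "size": len(c), "sha256": hashlib.sha256(c).hexdigest()}
            for i, c in enumerate(chunks)
        ],
    }


def drain_upload_queue(queue, store):
    """Pop queued chunks and write them to the (fake) blob store."""
    uploaded = []
    while queue:
        video_id, index, data = queue.popleft()
        store[(video_id, index)] = data
        uploaded.append(index)
    return uploaded


video = b"0123456789"
chunks = make_chunks(video, CHUNK_SIZE)
manifest = build_manifest("vid-1", chunks)
queue = deque(("vid-1", i, c) for i, c in enumerate(chunks))
store = {}
uploaded = drain_upload_queue(queue, store)
reassembled = b"".join(store[("vid-1", i)] for i in sorted(uploaded))
```

The point of the manifest is that each chunk can be uploaded and retried independently, and the DB insertion only happens once every listed chunk has arrived.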
The interviewer looked super sleepy for this one. 😴
lol I commend this man, millionaire just doing mock job interviews for fun.
And he would probably fail for an L5 role.
I'm having trouble seeing the difference in data size between video metadata and users. One billion users, each with 200 friends, that's 200 billion rows of data. Is that not similar in size to 10 Billion per year video metadata rows?
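To put the two row counts from this question side by side, here is a minimal sketch using only the figures in the comment (one billion users, 200 friends each, 10 billion video metadata rows per year). It compares row counts only and says nothing about bytes per row; storing each friendship once per pair is an added assumption.

```python
# Row-count comparison using the figures from the comment.
USERS = 1_000_000_000
FRIENDS_PER_USER = 200
VIDEO_ROWS_PER_YEAR = 10_000_000_000

# One row per (user, friend) direction, as the comment counts it.
friendship_rows = USERS * FRIENDS_PER_USER

# Assumption: if each friendship is stored once per pair, the count halves.
undirected_rows = friendship_rows // 2

# Years of video uploads needed before the metadata table has as many rows.
years_to_match = friendship_rows / VIDEO_ROWS_PER_YEAR

print(f"{friendship_rows:,} friendship rows, {undirected_rows:,} if stored once per pair")
print(f"video metadata matches that row count after {years_to_match:.0f} years")
```

So by row count the friendship edges are the larger dataset for a long time, which is why the edges of the graph, not the user records themselves, drive the sizing.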
Can't we use graph database for users?
The interviewer is so intimidating and comes off as annoyed
It’s realistic 😂
Just perception. Can be true or not
Amazing !
Is there a book or tutorial that's best for learning system design?
Way too much time spent in calculations at the beginning.
There is an error with the writing at @18:05. I believe the viewed videos should be 1,000,000,000 (one billion) / 100,000 not 1,000,000 (one million) / 100,000 (it's missing 3 zeros). The actual answer is correct though
49:14 Do metadata definitions matter if your database is NoSQL, aka schemaless? I feel like it is kind of weird to have the interviewer sitting there watching you emphasise these kinds of things.
Amazing video. Do you know where I can find more information on how regionalization would work in this scenario?
The keyword which you should be searching for is "Georouting". Here's a video on how to do it in AWS ruclips.net/video/pdlaarm8x10/видео.html
Is efficiency not an issue for system design interviews? I feel like Mark went through a not too complicated system but spent way too much time
Thanks!
Thank you. Very useful video :)
You're welcome!
Nice. But all these calculations at the beginning are so boring.
boo hoo
What tool is being used for drawing here?
The interviewer is focusing too much on app or showing model of the data. System design should focus on infrastructure.
Can you please explain? I can't get it: how can I get good write-heavy scale with a relational database if I can't shard it? For example, for bank applications, where I can't use NoSQL and I need strong consistency.
Why's the interviewer so cold man?
what do you want him to do, blow the guy kisses?
I think if the interviewer at some moment answers that it doesn't make sense, I'll get a heart attack =)) But seriously, thank you for the videos! I have a system design interview in 3 hours and I am very nervous.
how did it go?!
@IGotAnOffer-Engineering Ow, it was hard, but I think I made it. They asked me to design a task tracker with very high load on reads and writes. They rated me as a junior+ within the senior grade. Thank you for asking!))
I think it's totally different in reality LOL. All those system designs are assumptions made in a vacuum, without a real production-infra use case, and they end up overcomplicated. No one builds at this scale from the start; you will always face legacy sh*t first, then iterate over it.
Which tool is used to create the diagrams?
Google Draw
Great video. Some suggestions for improvement:
We could use NoSQL for the user data and follow a graph schema for the followers and followees.
Use Redis to cache the user data and video metadata if necessary.
We could use SQL for video metadata, since there's no join operation, and introduce sharding, but NoSQL works great too.
Very nice
Why did you delete chatgpt system design interview?
Great content
What is the app used in the video to draw diagrams and stuff?
Google Draw
Thanks @IGotAnOffer-Engineering
some of the talk is application design
thank you bro
Google EM doesn't trust GCP solutions. 🤔
Very useful.
thanks Ryan, good to hear
Can you please help with the name of the diagram drawing tool that Mark is using?
Sure, it's Google Draw
Hi, your videos are great. I'm not looking to pass any interview, just to better understand the topic, as I have more of a data science and math background. Can you suggest a good book on system design?
Sorry for the slow reply. It's interview-based, but I'd still recommend Alex Xu's system design book, or his ByteByteGo channel.
What whiteboarding tool is he using in this video?
it's Google Draw
Honestly this guy yaps and rants way too much and the design seems too high level. this might be great for an EM, but I can’t imagine a Senior+ IC not going into detail like this guy and still passing their Google interview.
It's funny how ex Google guy prefers AWS services 😁
why exactly does an entry level college graduate need to know tiktok scale system design
Most companies no longer give system design interviews for entry level college grads, so don't worry! Google now starts from Level 5 and above.
Mr Durden are you?
umm why do calculations on arbitrary numbers LOL this guy probably spent 30 years at microsoft
Can you make one on a bus ticket booking system?
True, we need one on a booking system.
Okay Mr Avelino we'll see what we can do
Hi, I see there is a lot of basic calculation in the interview. Do you have a table somewhere with these size assumptions for text/images/video/music?
Why not a Graph DB to store the User Data instead of using Relational Database? Graph DB can be queried quickly instead of complex sql queries with RDBMS
Yours is a perfectly valid proposal.
Although, I don't know if I'd categorize any query for a TikTok application as "complex" in terms of SQL. The underlying data itself isn't extremely complicated.
The interviewer is weak sauce, and should not be interviewing.
$ money saver enhancement = presigned URLs to offload bandwidth
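For readers unfamiliar with the idea: a presigned URL lets the client upload to or download from storage directly, so the video bytes never pass through the app servers. The sketch below is a toy illustration of that general idea using an HMAC signature with an expiry. It is not the S3 signing algorithm, and the secret, host name and function names are all made up.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical key shared by the app server and storage


def sign_url(base_url, key, expires_at):
    """App server: issue a time-limited URL for one object key."""
    message = f"{key}:{expires_at}".encode()
    signature = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires_at, "sig": signature})
    return f"{base_url}/{key}?{query}"


def verify_url(url, now):
    """Storage side: accept the request only if unexpired and untampered."""
    parsed = urlparse(url)
    key = parsed.path.lstrip("/")
    params = parse_qs(parsed.query)
    expires_at = int(params["expires"][0])
    message = f"{key}:{expires_at}".encode()
    expected = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return now <= expires_at and hmac.compare_digest(expected, params["sig"][0])


url = sign_url("https://storage.example.com", "videos/abc.mp4", expires_at=1000)
```

The app server only hands out the short URL; the storage layer can check it without calling back to the app, which is where the bandwidth saving comes from.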
Why return the URL list to the TIKTOK APP in the above diagram? Why can't we get the URLs via the other app (to the right of the LBs), the TIKTOK APP SERVICE? Can the TIKTOK APP SERVICE fetch the original video from BLOB or CACHE (DNS) using the VIDEO_ID? Why return URLs again?
If this happened in a real interview, would you lean toward hiring this interviewer?
Azure blob by a google mate 12:10, is that admitting defeat by the giants themselves?