one of those videos where you get most from viewing time - very concise, effective, no side bs, covers edge cases etc 💜
Amazed by the simplicity with which you explained everything, I think I will never forget a URL shortener design now
Excellent video, I think in system design I tend to think it's going to have many more moving parts but this shows that sometimes it's just client-server-DB on steroids.
Yeah it’s true, this is a relatively simple example, it can get potentially way crazier when talking about other common system design questions like, say, how would you build Uber, or how would you build Spotify, or something
The starting part where he explained the calculations of storage and other stuff gave me GOOSEBUMPS, just imagine explaining the same flow when the interviewer asks you this!!
Solid, concise and the perfect warm up for every time I'm doing a last minute refresh before an interview.
What a coincidence, I am actually building a URL shortener as a personal project and your video has provided me with more than enough information to build on what I already have. Great video.
Heh, if you get a short URL domain, you are bound by law to create a URL shortener app. I got the last single letter domain in my country, so of course I did exactly that. Not going to post the link here, don't know if my tiny server would handle the traffic
@jonragnarsson what's a short URL domain? Also, does that mean X (Twitter) should provide a URL shortening service on the new domain as well?
10/10 explanations and visuals. We need more!
15 mins of video and hours of value, great video Loved it👍
Would love to see more videos like this on designing systems
Love this content wish there was more system design videos like this on RUclips. Thank you.
Great video! Got an interview next week which I was told I’m gonna be designing a system similar to the one in the video, this helped me a lot.
Awesome! Appreciate you watching
Simple and just deep enough for a non-coder to understand. Awesome!
This is the best explanation of this I have seen. Thank you!
Great video
Love the look into the details
What a nice piece of knowledge and thought delivery 👏👏
Really good explanation.
Very nice breakdown, reveals a lot of useful information.
Do more system design videos. Absolutely loved it 👍
Thank you so much for the quality content. Please make a system design series! I really struggle and feel inefficient when designing complex systems.
Great explanation...............best one so faaaaaaar
Very nice
I love this kind of videos covering the theoretical part of programming! Great information on system architecture😁
Love the way you explain things in a simple way! Would love to see more system design videos from you!
thanks! just published a new one
@codetour YESSSSIR! Can't wait to watch it tonight when I get home!
Great video, it is simple, informative and easy to understand
Thanks a lot
Glad the algo picked up your video your channel is really good
Loved your explanation. Looking forward for more videos.
Great tutorial🎉❤
Love It, simple and informative
I love the content that YouTube has finally started dropping in my recommended
Great video and easy to follow explanation! Looking forward to more
pleaaase more of this , the format is brilliant
i m just loving it
Really amazing content. Keep it up!
Amazing.. Bravo 👍👍
Hands down top 2 system design vid on TinyURL on this site.
I'm glad I discovered this channel!
Simple and yet scalable design compared to others. Thanks a lot...
Learnt a lot of concepts in this one. Thank you so much!!😅
great video, it demonstrates the importance of thinking a bit in advance, before starting to code. Eventually we end up with a cache lookup system.
I have some questions...
1) Do you consider validating the URLs? Is there a limitation? What if someone basically started to use this as a free cache store...
2) Are these tiny URLs public, or do you need access keys to get to the real one?
3) I am wondering if you could possibly use a simple counter and hash that, instead of the whole URL. That would be faster and the hash would have a great distribution as well.
4) If you have the same hash for the same URL, it would be hard to delete the entry later, since another client holds the reference. However, that could be a "prime" feature
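A minimal sketch of question 3 above (hash a counter value instead of the whole URL), assuming SHA-256 as the hash and a 7-character base-62 key. The names `ALPHABET` and `short_key_from_counter` are made up for illustration and are not from the video. Because the digest is truncated, two counters could in principle map to the same key, so a uniqueness check at insert time would still be needed.

```python
import hashlib

# Hypothetical base-62 alphabet for the short key.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def short_key_from_counter(counter: int, length: int = 7) -> str:
    """Derive a fixed-length key by hashing a counter value, not the long URL."""
    digest = hashlib.sha256(str(counter).encode("ascii")).digest()
    # Use the first 8 bytes of the digest as an integer, then peel off
    # base-62 digits. Truncation means collisions are possible, only unlikely.
    value = int.from_bytes(digest[:8], "big")
    chars = []
    for _ in range(length):
        value, remainder = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[remainder])
    return "".join(chars)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, short_key_from_counter(n))
```

Hashing a small integer is cheaper than hashing a long URL, and consecutive counters give unrelated-looking keys, which is the distribution benefit the question is after.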
Just one word - Perfect. Something I was looking for! Thanks
crazy good explanation
Thanks for the video
Waiting for new system designs ❤❤
Saw this video, watched the rest of your videos, learned a bunch, and came back to say thanks.
I really liked the system design videos. Seems like the algorithm liked it too. Keep up the good work and see you in the next one.
Great video, you have a subscriber! Had a couple of questions about the shortening approaches:
1. On the key-generation approach: What's the rationale behind pre-generating keys? Are you trying to avoid a uniqueness check at creation time? Would the UNIQUE index on an SQL table be too big/slow?
2. On the hashing approach: Does the hash function guarantee equal distribution amongst the buckets? Not sure if picking the first letter out of the hash guarantees that. If not, perhaps re-hashing the hash with a function that guarantees a uniformly random output might do the trick. All this to say that skewed shards might be a big problem.
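A rough way to check the skew concern in question 2, assuming SHA-256 as a stand-in for whatever hash the video uses and synthetic `example.com` URLs as input. It compares sharding on the first hex character of the hash against sharding on the hash value modulo the shard count. All function names here are hypothetical.

```python
import hashlib
from collections import Counter


def shard_by_first_char(key: str) -> int:
    """Shard on the first hex character of the hash (always 16 buckets)."""
    return int(hashlib.sha256(key.encode()).hexdigest()[0], 16)


def shard_by_modulo(key: str, num_shards: int) -> int:
    """Shard on the hash value modulo an arbitrary number of shards."""
    value = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")
    return value % num_shards


def spread(counts: Counter) -> float:
    """Ratio of the fullest bucket to the emptiest one (1.0 is perfectly even)."""
    return max(counts.values()) / min(counts.values())


# Synthetic keys, assumed purely for the experiment.
keys = [f"https://example.com/page/{i}" for i in range(50_000)]
first_char_counts = Counter(shard_by_first_char(k) for k in keys)
modulo_counts = Counter(shard_by_modulo(k, 10) for k in keys)

if __name__ == "__main__":
    print("first-char spread:", round(spread(first_char_counts), 3))
    print("modulo spread:", round(spread(modulo_counts), 3))
```

With a well-mixed hash, both schemes should come out close to even on this kind of input. The first-character scheme ties the shard count to the alphabet size, though, and skew can still appear if the key is not the raw hash (for example, a custom alias).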
Amazing simplicity bro, keep rocking
Awesome .. keep going and do more videos about SD ♥
Great video. Super informative.
Appreciate you!
👌👌
Great video. Couple of questions-
1. You said there shouldn't be more than one short URL for a single long URL. So do you check in the database whether a long URL already has a short URL, or does the "hash" function take care of it?
2. Do you just return the same short URL to a different person if they request a long URL that a previous person already shortened? If yes, then why do you store the userId in the table?
yes! great questions. 1. In theory the hashing technique (or the random keygen + dedicated db technique) should always create unique hashes so there should be no collisions. But it's cheap to just double check the DB and make sure no shortURL already exists so there are no duplicates/collisions. So we can just go ahead and do that as well.
2. That's a great point, I honestly hadn't really considered. I think for user experience you would want to give person 2 a fresh unique tinyURL (especially if they are requesting a custom tiny url). So there would be 2 different entries in the database where the keys are different tiny hashes, but the values both include the same long URL. so to your point, yeah the userID might not be necessary
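The "double check the DB" step in the reply can be sketched as follows, assuming an in-memory SQLite table where the short key is the primary key, so the database itself rejects duplicates. The table layout and the names `make_db`, `random_key`, and `create_short_url` are illustrative assumptions, not the video's schema.

```python
import secrets
import sqlite3
import string

ALPHABET = string.ascii_letters + string.digits


def make_db() -> sqlite3.Connection:
    """Create an in-memory table; the PRIMARY KEY enforces unique short keys."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE urls (short_key TEXT PRIMARY KEY, long_url TEXT NOT NULL)")
    return conn


def random_key(length: int = 7) -> str:
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def create_short_url(conn, long_url, keygen=random_key, max_attempts=5) -> str:
    """Insert a new mapping, retrying with a fresh key if the key is taken."""
    for _ in range(max_attempts):
        key = keygen()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO urls (short_key, long_url) VALUES (?, ?)",
                    (key, long_url),
                )
            return key
        except sqlite3.IntegrityError:
            continue  # key collision: try another key
    raise RuntimeError("could not find a free short key")


if __name__ == "__main__":
    db = make_db()
    print(create_short_url(db, "https://example.com/some/long/path"))
    print(create_short_url(db, "https://example.com/some/long/path"))
```

Calling it twice with the same long URL produces two rows with different keys, which matches point 2 of the reply: each requester gets a fresh short URL pointing at the same destination.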
@@codetour Thanks for the reply. Looking forward to more videos on system design questions. Cheers 🍻
subscribed👍
I think you should have expanded on how to compute that tinyurl. It's more relevant than explaining the LRU, which was very superficial
Great video. Principal Engineer here, learned a thing or two and you touched on just enough. I liked the stats, however depending on how popular this service is these things would change a bit. An idea for more content would be to break these up into something like a hobby project, medium size (whatever that means), and enterprise level (what you displayed here).
Regardless loved it!
@Codetour very clear explanation. Could you share which tool you are using for drawing?
When talking about choosing between RDS vs NoSQL, IMO I was a little uncertain when it mentioned strong (RDS) vs eventual consistency (NoSQL). For RDS with a single instance, it's fair to say it provides strong consistency, but once replication nodes are involved, RDS may not guarantee consistency either
Great presentation ! One question though, how would you determine in advance which links are considered "hot" ?
Hi there, I really loved the way you explain, excellent video!
Would you mind sharing the tools you used to record this video and create this presentation?
Very useful! 🙌👨💻
Is the cache size estimate good enough? I think we should estimate the cache size based on the short URLs that get hotter than the others. So, if it's only 20% of the short URLs created within a month, we can roughly expect ((500 bytes * 100 mil) / 100) * 20 = around 9 GB.
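The arithmetic in this estimate can be written out directly. The figures (500 bytes per entry, 100 million URLs per month, 20% hot) are the ones quoted in the comment; the variable names are made up for the sketch.

```python
# Figures quoted in the comment above.
BYTES_PER_ENTRY = 500
URLS_PER_MONTH = 100_000_000
HOT_PERCENT = 20

# ((500 bytes * 100 mil) / 100) * 20, kept in integer arithmetic.
cache_bytes = BYTES_PER_ENTRY * URLS_PER_MONTH // 100 * HOT_PERCENT

cache_gb = cache_bytes / 10**9    # decimal gigabytes
cache_gib = cache_bytes / 2**30   # binary gibibytes

if __name__ == "__main__":
    print(f"{cache_gb:.1f} GB, or about {cache_gib:.2f} GiB")
```

The result is 10 GB in decimal units, which is about 9.3 GiB in binary units. That is presumably where the "around 9 GB" figure comes from.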
Subscribed 😅😅
Food for thought: What if we did not want the tiny url to be stored forever? Say we want it to be available for a short span of say 2 hours? What's the approach?
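One possible approach to the expiry question, sketched under assumptions: store an expiry timestamp next to each mapping, treat expired entries as missing on lookup, and sweep them out periodically. `ExpiringUrlStore` is a hypothetical in-memory stand-in for the real database; many key-value stores offer a built-in per-key TTL that would play the same role.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ExpiringUrlStore:
    """In-memory mapping of short key -> (long URL, expiry timestamp)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, short_key: str, long_url: str, ttl_seconds: float) -> None:
        self._entries[short_key] = (long_url, self._clock() + ttl_seconds)

    def get(self, short_key: str) -> Optional[str]:
        """Return the long URL, or None if the key is unknown or expired."""
        entry = self._entries.get(short_key)
        if entry is None:
            return None
        long_url, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[short_key]  # lazy deletion on read
            return None
        return long_url

    def purge_expired(self) -> int:
        """Background-style sweep; returns how many entries were removed."""
        now = self._clock()
        expired = [k for k, (_, exp) in self._entries.items() if now >= exp]
        for key in expired:
            del self._entries[key]
        return len(expired)


if __name__ == "__main__":
    store = ExpiringUrlStore()
    store.put("abc1234", "https://example.com/long", ttl_seconds=2 * 60 * 60)
    print(store.get("abc1234"))
```

Lazy deletion keeps reads correct even if the sweep lags behind, while the sweep reclaims space for links that are never requested again.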
Thank you
Thanks for the great video! It covers many important points. However, I think the SQL vs. NoSQL explanation isn't entirely accurate. Eventual consistency isn't exclusive to NoSQL databases; both relational (SQL) and NoSQL databases can typically be configured to replicate writes either synchronously or asynchronously. The ACID properties relate to transaction management and do not address data consistency across database replicas. The word "Consistency" in ACID means "the DB ensures that a transaction can only bring the database from one consistent state to another, preserving database invariants like key uniqueness etc."
Hi
In the approach with base64 conversion there is a problem to address.
We have a counter that is shared across servers, so that counter is a critical section, and a race condition might arise because different servers can read the same value and generate the same short URL.
We must ensure mutual exclusion in that case.
Correct me if I am wrong :)
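A small sketch of the mutual-exclusion point, assuming threads stand in for app servers and a lock-protected in-process counter stands in for the shared one. `RangeCounter` is a hypothetical name. In a real multi-server deployment the counter would live in a shared service with an atomic increment (a database sequence or Redis `INCR`, for instance), not in one process.

```python
import threading
from typing import List


class RangeCounter:
    """Hands out non-overlapping blocks of IDs; the lock is the critical section."""

    def __init__(self, start: int = 0) -> None:
        self._next = start
        self._lock = threading.Lock()

    def reserve(self, block_size: int) -> range:
        # Read-and-advance must happen atomically, otherwise two callers
        # could read the same value and receive the same IDs.
        with self._lock:
            first = self._next
            self._next += block_size
        return range(first, first + block_size)


def simulate(num_servers: int = 8, blocks_per_server: int = 100, block_size: int = 10) -> List[int]:
    """Run several 'servers' concurrently and collect every ID they were given."""
    counter = RangeCounter()
    issued: List[List[int]] = [[] for _ in range(num_servers)]

    def server(index: int) -> None:
        for _ in range(blocks_per_server):
            issued[index].extend(counter.reserve(block_size))

    threads = [threading.Thread(target=server, args=(i,)) for i in range(num_servers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [n for per_server in issued for n in per_server]


if __name__ == "__main__":
    ids = simulate()
    print(len(ids), "ids issued,", len(set(ids)), "distinct")
```

Reserving a block at a time means each server contends for the shared counter once per block instead of once per URL. Each reserved integer would then be encoded into the short key.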
where does the 70GB cache storage come from? Given 60TB total storage, wouldn't 20% of 60TB be around 12 TB?
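One reading that reproduces both figures, offered as a guess and not as the video's confirmed method: size the cache for 20% of one day's read requests, not 20% of total storage. The inputs (100 million new URLs per month, a 200:1 read-to-create ratio, 500 bytes per entry, 100 years of retention) are the numbers quoted elsewhere in this thread. The 30-day month and the "20% of daily reads" rule are assumptions.

```python
# Inputs quoted in the thread; DAYS_PER_MONTH and HOT_PERCENT are assumptions.
NEW_URLS_PER_MONTH = 100_000_000
READS_PER_CREATE = 200
BYTES_PER_ENTRY = 500
RETENTION_YEARS = 100
DAYS_PER_MONTH = 30
HOT_PERCENT = 20

# Total storage: every URL ever created, kept for the full retention period.
total_storage_bytes = NEW_URLS_PER_MONTH * 12 * RETENTION_YEARS * BYTES_PER_ENTRY
total_storage_tb = total_storage_bytes / 10**12

# Cache: 20% of the read requests seen in a single day.
reads_per_day = NEW_URLS_PER_MONTH * READS_PER_CREATE // DAYS_PER_MONTH
cache_bytes = reads_per_day * HOT_PERCENT // 100 * BYTES_PER_ENTRY
cache_gb = cache_bytes / 10**9

if __name__ == "__main__":
    print(f"storage: {total_storage_tb:.0f} TB, cache: {cache_gb:.1f} GB")
```

Under these assumptions the storage comes to 60 TB and the cache to roughly 67 GB, which rounds to the 70 GB figure. Taking 20% of the full 60 TB would indeed give 12 TB, but that would cache a century of links instead of a day's hot set.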
what's the name of the whiteboard
seems pretty cool
Photoshop
Super mindful video, you better not get lost in any section of the video or you'll end up like, is this gibberish? 😂
Awesome explanation 🎉
in 5:07, I don't think that the API key should be needed for the redirection endpoint
Crazy to think how much there is to even a simple seeming thing and then you realize there is still so much missing like authentication, authorization, payment handling, plans, email confirmations, internationalization of the website, possibly rate limiting for non paying customers…
I think a more realistic solution would use cloud providers and their services which for this case is even more simple with some KV storage and serverless functions
"slightly inspired" by Sandeep's article 🤔
Is there any reason for the assumption of a 200:1 read-to-create ratio? If so, please explain.
No particular reason for those specific numbers. This is a napkin calculation so the idea is to say “here we’ll get way more reads than writes”. for this exercise it wouldn’t make much difference if the ratio was 400:1 or 1000:50 etc.
So, you just make up those numbers, define your own requirements, and throw a bunch of servers at these things.
I don't understand why people take system design interviews. It just seems like common sense to me to load balance according to your requirements. Any software dev should know this.
can people really do all this math in their head that quick?
How exactly will I earn money at that scale? 😮
The service should also provide some statistics, so users know if they draw any traffic and when. Add these to your calculations - data + load
You said 100 years - where is the expiry date? How often do you check when to delete?
The 100 years idea is just stupid overkill, stretching the whole budget. Give it a few years, or check traffic to each link and delete it if it seems unused.
Oh rdbms? You must be a dinosaur or a reptile
what's with the holding your mic trend 😂
Using a microphone improves the quality of your audio much more than for example just recording straight into your phone/camera
@codetour That's not what I meant. I get the point of using a good mic. But why not just put it on your shirt instead of holding it in your hand 😉
@blizzy78 the sound quality is actually slightly better when the mic head doesn't rub against fabric
You're a surface-level scammer.
Tell me more
@codetour I'll be honest with you. I've been shadow-banned many times, so I just assume all of the comments I post will never be seen.
First shorten it. You are talking something else.😅
Intuitively you are correct my friend, but if you ever go into an interview it’s important to clarify why, and how, you are shortening it first. The first thing you should always do is ask clarifying questions