How to implement TinyURL (System Design Interview)

  • Published: 29 Jun 2024
  • In this video I explain the system design theory behind a possible TinyURL implementation in the context of a technical software engineering interview.
    ▶ Get my FREE Data Structures Crash Course: bit.ly/2YWCn1C
    I start by helping you understand what the problem is asking for, then I will go over the pseudo-code/logic for a solution, and finally I will write the solution code explaining each line, step-by-step.
    Using my simple and easy-to-understand methodology, I make difficult and complex interview questions incredibly easy to understand. So easy in fact, that even your grandma could do it! 👵🏻
    I will be solving a new problem EVERY WEEK, so be sure to smash that subscribe button and hit the notification bell so you don't miss a single one! 😀
    -----
    I also offer full-length interview preparation courses, programming tutorials, and even 1-1 tutoring!
    ▶ Get access to MY COMPLETE SYSTEM: kaeducation.com
    Who knew acing your next software engineering interview could be so incredibly easy!
    -----
    #systemdesign #codinginterview #tinyurl

Comments • 116

  • @malcolmdinz1912
    @malcolmdinz1912 3 years ago +13

    Best TinyURL design video I've found, and some concepts here can be applied to other design questions too. Thanks for making this!

  • @PallNPrash
    @PallNPrash 3 years ago +2

    Excellent video!! You packed everything of importance in a nice, short video. GREAT job!! And THANK YOU!!

  • @cynthia7000
    @cynthia7000 3 years ago +18

    Good quality! Would love to see more of these system design videos!!!

  • @alexbordon8886
    @alexbordon8886 2 years ago +5

    omg this is the best system design tutorial for TinyURL!!! Each minute hits the point instead of bullshit. Great, great, great!

  • @mariezhang6818
    @mariezhang6818 4 years ago +2

    Best TinyURL design video! Good pace.

  • @rajathrao3209
    @rajathrao3209 2 years ago +1

    Would love to see more of these system design videos!!!

  • @RandomShowerThoughts
    @RandomShowerThoughts 1 year ago

    This was very good, clean and concise. Really like this video, especially the reasoning for not picking SQL.

  • @neosapien247
    @neosapien247 4 years ago +1

    This is the best video on this particular topic.

  • @d86clot
    @d86clot 4 years ago

    Amazing video. Thanks for keeping it short and clear.

  • @vaibhav_cs
    @vaibhav_cs 7 months ago

    Excellent approach and proper delivery of content !

  • @tanujaphadke8219
    @tanujaphadke8219 3 years ago

    this is the BEST video on this topic

  • @yuanyuanweng9784
    @yuanyuanweng9784 4 years ago

    Thanks! Best system design video. Waiting for more videos.

  • @bhavana1711
    @bhavana1711 4 years ago +5

    This is the best video with example solution for this problem, keep it up!

  • @AnangPhatak
    @AnangPhatak 7 months ago +1

    Please make more videos on systems design. The content here is presented succinctly, with simple, non-complex diagrams and lucid explanations.

  • @rohinirenduchintala
    @rohinirenduchintala 11 months ago +2

    The best comprehensive tinyURL video for system design i've seen so far. Please make more!

  • @coop4476
    @coop4476 4 years ago +1

    This is excellent! Thank you!

  • @arjunsankarlal5296
    @arjunsankarlal5296 3 years ago

    Well explained!

  • @BestURLShortenerBioPageQRCode
    @BestURLShortenerBioPageQRCode 9 months ago

    Thank you for sharing this much information.👍👍

  • @tenaciousbali
    @tenaciousbali 3 months ago

    Comprehensive explanation and good pointers. Loved this!!!

    • @umeshhbhat
      @umeshhbhat 3 months ago

      I said the same thing too.

  • @sindhumohana6164
    @sindhumohana6164 3 years ago

    Best video on this Topic !

  • @jamiepearcey9335
    @jamiepearcey9335 3 years ago

    Great video explanation.

  • @talktovideos4349
    @talktovideos4349 1 year ago

    Thanks a lot for making this video

  • @seeboonsoo
    @seeboonsoo 4 months ago

    No one explains this better than this guy!!

  • @studyiq5015
    @studyiq5015 2 years ago

    Great explanation. Please post more system design videos. Thanks :)

  • @deep8843
    @deep8843 1 month ago

    Really helpful!

  • @gsb22
    @gsb22 2 years ago +13

    @13:11 Everything was going well: you even mentioned the security issue of a counter making the URLs guessable, and then you said we use 10-15 bits to add randomness. First, we cannot add bits that make the number bigger, because ZooKeeper might reach that range later when it starts from 1 million. Second, let's say we add random characters to it: then the Base62 encoding won't be 7 chars long, and you have to take the first 7 chars, which WILL result in a collision at some point in the future.

    • @ericfries7229
      @ericfries7229 2 years ago

      Convert the number from Zookeeper and the random bits to strings, then concatenate the strings instead of adding ints.

    • @Maffeos
      @Maffeos 2 years ago +2

      @ericfries7229 Could you elaborate? From what I understand, if we concatenate the number with the random bits, then the resulting string would be more than 7 chars. Say the new length is 10: Base62 encoding would give another 10-char string, which can still result in a collision, no?

    • @alperkocatas2353
      @alperkocatas2353 3 months ago

      Up. Same concern here... Looks like we need another method for randomness.
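
One possible way to reconcile the counter with the random bits, sketched under assumptions that are not from the video: pack the counter into the high bits of a single integer and the random bits into the low bits, then Base62-encode that integer once. Distinct counters still produce distinct codes, and the cost is a smaller usable counter range. The alphabet order, `RANDOM_BITS = 10`, and all function names are illustrative choices.

```python
import secrets

# Assumed alphabet order: digits, lowercase, uppercase.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
RANDOM_BITS = 10  # assumed width of the random suffix

# Largest counter whose packed value still fits in 7 Base62 characters.
MAX_COUNTER = (62 ** 7 >> RANDOM_BITS) - 1


def base62_encode(n: int) -> str:
    """Write a non-negative integer in base 62."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def short_code(counter: int) -> str:
    """Counter in the high bits, random bits in the low bits, encoded once."""
    if not 0 <= counter <= MAX_COUNTER:
        raise ValueError("counter does not fit in 7 characters")
    suffix = secrets.randbits(RANDOM_BITS)
    return base62_encode((counter << RANDOM_BITS) | suffix)
```

With 10 random bits the usable counter range shrinks by a factor of 1024, from about 3.5 trillion to about 3.4 billion values. The suffix only makes neighbouring codes harder to guess; it is not an access-control mechanism.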

  • @georgesmith9178
    @georgesmith9178 3 years ago

    Thumbs up, thank you.

  • @eleanorwang6226
    @eleanorwang6226 3 years ago

    Very clear illustration! Thank you

  • @DK-ox7ze
    @DK-ox7ze 1 year ago +7

    If we add 10-15 bits at the end of the counter number, then it will increase the Base62 output size and exceed the 7-character limit. So can you clarify how that addition of a random string works?

    • @jinushaun
      @jinushaun 9 months ago

      TLDR: md5(“123”) vs md5(“123,xyz”)
      The original solution hashed the returned counter value, which is a simple increasing number. The resulting hash is then base62 encoded.
      The new solution takes the counter number and appends some extra characters at the end. This new string is then hashed and base62 encoded.
      In this way, the string sent into hash is not guessable. This works because the numeric portion is still guaranteed not to collide.

    • @DK-ox7ze
      @DK-ox7ze 8 months ago +1

      But in this new solution also, there is a chance of collision because we are hashing the counter+random_chars and limiting the hash to a short length.

    • @chirut4327
      @chirut4327 8 months ago

      Even without appending extra digits, Base62 does not guarantee 7 chars. That simply means we are not confining ourselves to 7 chars. We probably need to tell the interviewer that we would get as short a hash as possible, but with no guarantees. This is a possible tradeoff to avoid collisions.
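
To put a rough number on the collision concern raised in this thread: if the hash of `counter + random_chars` is truncated to 7 Base62 characters and the truncated values are assumed to behave like uniformly random draws (an assumption, not something stated in the video), the birthday approximation estimates how quickly collisions become likely. The function name is illustrative.

```python
import math

SPACE = 62 ** 7  # number of distinct 7-character Base62 codes


def collision_probability(num_urls: int, space: int = SPACE) -> float:
    """Birthday approximation: 1 - exp(-n(n-1) / (2 * space))."""
    return 1.0 - math.exp(-num_urls * (num_urls - 1) / (2 * space))


if __name__ == "__main__":
    for n in (100_000, 1_000_000, 2_200_000, 10_000_000):
        print(f"{n:>12,} URLs -> P(collision) ~ {collision_probability(n):.3f}")
```

Under that assumption, a collision becomes roughly a coin flip after a little over two million URLs, which is why truncated hashes need a collision check while a plain encoded counter does not.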

  • @hyrdeshgangwar
    @hyrdeshgangwar 1 year ago

    Good Stuff!

  • @fannyzheng1077
    @fannyzheng1077 4 months ago

    Thank you

  • @mrowox
    @mrowox 2 years ago

    Concerning the point that round-robin load balancing does not take server load into consideration, so it might still forward requests to an overloaded or slow server: I believe that, given how this approach works, if one server is overloaded then all the servers are probably overloaded, since it would have been distributing the load equally. The exception is when the servers are not of equal capacity, which raises the question of why you would spin up a server of lesser capacity.

  • @kavyapallari2902
    @kavyapallari2902 2 years ago

    Good video

  • @mohamedelidrissi810
    @mohamedelidrissi810 1 year ago

    For caching, will it be better to have some kind of background job that populates the cache with the most popular URLs from the database? Else if you're always adding a URL to the cache, then it's no longer just the popular URLs but all of them (or at least the capacity of the cache).

  • @narindersharma303
    @narindersharma303 2 years ago +2

    If we have to Base62-encode the number anyway, why not precompute the codes and distribute them to the servers, i.e. S1 keeps the Base62 codes for 0-1M, S2 keeps the codes for 1M-2M, and so on?

  • @bayareacarnatic
    @bayareacarnatic 1 year ago +1

    If you are using a unique counter, why do you still need MD5 and Base62 encoding? And then you need additional complexity, like integrating ZooKeeper, just to maintain counters.

    • @boyangzheng269
      @boyangzheng269 3 months ago

      MD5 is not required, but Base62 encoding is; otherwise imagine how long the URL could be when the count increases to 1 trillion.

  • @FlashFreelancers
    @FlashFreelancers 16 days ago

    How do we identify a long URL that has already been converted to a tiny URL? Since we are using a counter, it will get a new tiny URL each time, no?

  • @GauravSharma-wb9se
    @GauravSharma-wb9se 2 years ago +2

    Does Base62 encoding of the counter guarantee a string of only 6 or 7 characters? If so, then Base62 encoding of the MD5 hash should also return a 6- or 7-character string. Please clear up this doubt.

    • @chirut4327
      @chirut4327 8 months ago

      Believe me brother, none of the guys were able to clearly explain this on YouTube, in books, or in blogs. Base62 does not guarantee anything about the result length. It generally increases in this case, though.

  • @fjeld11
    @fjeld11 4 years ago +3

    Thanks for the excellent video. A question: in solution 2 with ZooKeeper, how does each server record the range it was given and know when the range is exhausted? Is there a counter on each server?

    • @ketanmalwa4580
      @ketanmalwa4580 3 years ago

      That logic can reside on the zookeeper service itself. For instance, the coordination service can use a custom hashing function based on the incoming request and forward it to specific servers depending on the range value.

    • @fjeld11
      @fjeld11 3 years ago

      @ketanmalwa4580 Got it, thanks a lot!
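
A minimal sketch of one way a server could track its range, using a local cursor on each server: the coordinator hands out the next unused block, and the server asks for a new block only when its current one is exhausted. `RangeCoordinator` is a hypothetical in-memory stand-in for a coordination service such as ZooKeeper, not its real API, and all names are illustrative.

```python
import threading


class RangeCoordinator:
    """Hypothetical in-memory stand-in for the coordination service."""

    def __init__(self, range_size: int) -> None:
        self.range_size = range_size
        self._next_start = 0
        self._lock = threading.Lock()

    def lease(self) -> range:
        """Hand out the next unused block of counter values."""
        with self._lock:
            start = self._next_start
            self._next_start += self.range_size
        return range(start, start + self.range_size)


class AppServer:
    """Keeps a local cursor into its leased block."""

    def __init__(self, coordinator: RangeCoordinator) -> None:
        self._coordinator = coordinator
        self._ids = iter(())  # empty until the first lease

    def next_id(self) -> int:
        try:
            return next(self._ids)
        except StopIteration:
            # Block exhausted (or never leased): fetch a fresh one.
            self._ids = iter(self._coordinator.lease())
            return next(self._ids)
```

Because blocks are leased on demand rather than assigned per server up front, the scheme does not depend on the number of servers staying fixed. A server that crashes simply abandons the unused tail of its block.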

  • @jochenjochen4246
    @jochenjochen4246 2 years ago

    I am curious why 128 bits come out to about 20 chars when using MD5 + Base64/62. 128 bits is 32 chars in hex, but what next? Thanks in advance.
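
The length follows from how many symbols are needed to cover 2^128 values: each hex character carries 4 bits, each Base64 character 6 bits, and each Base62 character about 5.95 bits. A small sketch of the arithmetic (the function name is illustrative):

```python
def chars_needed(bits: int, base: int) -> int:
    """Smallest k such that base**k >= 2**bits."""
    k = 0
    capacity = 1
    while capacity < 2 ** bits:
        capacity *= base
        k += 1
    return k


if __name__ == "__main__":
    for base in (16, 62, 64):
        print(f"base {base}: {chars_needed(128, base)} characters")
```

So a full 128-bit MD5 digest needs 32 hex characters but only 22 characters in Base62 or Base64.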

  • @Yoyomanmanholla
    @Yoyomanmanholla 2 years ago +1

    LRU cache eviction has a big shortcoming. If we assume the cache is always at capacity, then with each new short URL creation one of the top 20% of URLs will surely get evicted from the cache to make room for a random unpopular URL someone has submitted.
    This eviction approach means there will be a steady state of URLs in the cache that are not popular at all.

    • @boyangzheng269
      @boyangzheng269 3 months ago

      LRU is used to maintain the top 20%, so the usage count also matters.
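
For reference, the mechanics being debated can be sketched in a few lines. This is a generic least-recently-used cache built on `collections.OrderedDict`, not the cache from the video; the class and method names are illustrative.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Evicts the entry that was read or written least recently."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, short: str) -> Optional[str]:
        if short not in self._items:
            return None
        self._items.move_to_end(short)  # mark as most recently used
        return self._items[short]

    def put(self, short: str, long_url: str) -> None:
        self._items[short] = long_url
        self._items.move_to_end(short)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used
```

Plain LRU ranks entries by recency only, not by hit count. Whether newly created URLs are written to the cache at all is a separate design choice: calling `put` only on a read miss would keep freshly created but never-visited URLs from displacing hot entries.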

  • @chilly2171
    @chilly2171 2 years ago +1

    what if the custom url entered by the user already exists in the database? it results in duplication.

  • @funnyclipz520
    @funnyclipz520 1 year ago

    why can't the database (mongodb) itself generate an auto increment ID and then server can generate MD5 or base62 after adding salt to it ?
    Since the DB would never fail to generate a unique primary key on creation, it would avoid a single point of failure.
    Making a distributed solution for generating ID seems overkill here..
    I am new to this so let me know if there is a problem in this solution.

  • @mrowox
    @mrowox 2 years ago

    You mentioned ZooKeeper. What if the number of servers is not stable? It could be 8 today, 5 tomorrow, then 10 the day after. How does ZooKeeper manage those ranges based on the number of servers available? I really need to understand this. Please help.

  • @tgiflying
    @tgiflying 2 years ago +1

    Why do MD5 or SHA256? Just pick 7 characters at random with replacement from the set of base62 symbols
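
The random-pick approach suggested here could look like the sketch below. It needs a uniqueness check against already-issued codes, which is the cost a counter avoids. The `taken` set is a stand-in for a database lookup, and the names and retry limit are illustrative assumptions.

```python
import secrets

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def random_code(taken: set, length: int = 7, max_attempts: int = 5) -> str:
    """Draw random Base62 codes until one is unused, then reserve it."""
    for _ in range(max_attempts):
        code = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if code not in taken:  # stand-in for a database existence check
            taken.add(code)
            return code
    raise RuntimeError("could not find a free code")
```

In a real deployment the check and the reservation would have to be one atomic operation (for example a conditional insert), otherwise two servers could claim the same code at the same time.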

  • @renguomin1
    @renguomin1 2 years ago +1

    Why does introducing a counter guarantee no collisions?

  • @michelmdf
    @michelmdf 3 months ago

    Why not just rely on a database cluster to generate a sequential numeric ID, and just persist the ID and the long URL? Then when a request comes in, we convert from Base62 to decimal and find the record in the DB. What is wrong with this idea?
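
The round trip described in this comment can be sketched as follows. A dictionary stands in for the table keyed by numeric ID; the alphabet order, the sample URL, and the function names are illustrative assumptions.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def base62_encode(n: int) -> str:
    """Numeric ID -> short code."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def base62_decode(code: str) -> int:
    """Short code -> numeric ID, ready for a primary-key lookup."""
    n = 0
    for ch in code:
        n = n * 62 + INDEX[ch]
    return n


# Stand-in for a table with a sequential numeric primary key.
table = {125: "https://example.com/some/long/path"}

code = base62_encode(125)           # "21"
url = table[base62_decode(code)]    # lookup by decoded ID
```

The mapping itself works; the questions left open are where the sequential ID comes from at scale and the guessability of consecutive codes, which is what other comments in this thread discuss.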

  • @Freez2018
    @Freez2018 2 years ago

    Hey Kevin, should we wait for new system design videos?

  • @dbenbasa
    @dbenbasa 4 years ago +2

    Question about using counters: doesn't that mean that if the same user asks to shorten the same URL multiple times, they will get multiple short URLs? Is that OK from a requirements standpoint?

    • @PABJEEGamer
      @PABJEEGamer 3 years ago

      Whenever you get a request to shorten a URL, the first check should be whether it already exists in either the cache or the main DB. Only if it does not should you proceed with creating the short URL.
      Or you could use the insert "IF NOT EXISTS" feature provided by a NoSQL solution like Cassandra.

    • @gsb22
      @gsb22 2 years ago +1

      You can't return the same URL ever again.
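
Both behaviours discussed in this thread (reuse the existing short URL, or always mint a new one) can be expressed with a reverse index from long URL to code. This is a sketch with in-memory dictionaries standing in for the database; the `id<N>` codes are placeholders rather than Base62 output, and the class and parameter names are hypothetical.

```python
import itertools


class Shortener:
    """In-memory sketch with a reverse index for deduplication."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self._by_long: dict = {}   # long URL -> first code issued
        self._by_short: dict = {}  # code -> long URL

    def shorten(self, long_url: str, reuse: bool = True) -> str:
        if reuse and long_url in self._by_long:
            return self._by_long[long_url]
        code = f"id{next(self._counter)}"  # placeholder for an encoded counter
        self._by_short[code] = long_url
        self._by_long.setdefault(long_url, code)
        return code

    def resolve(self, code: str) -> str:
        return self._by_short[code]
```

Which default is right depends on the requirements: reuse saves key space, while a fresh code per request supports per-link analytics.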

  • @ubaidmanzoorwani6254
    @ubaidmanzoorwani6254 2 years ago

    Why do we have to Base62-encode after MD5?
    Why is MD5 not enough?

  • @johnhuynh9445
    @johnhuynh9445 4 years ago

    Let's say the requirements change and we want a unique_url every time the same long_url is entered. (Use Case: you want to conduct analytics on the different entry points to the long_url site). How would we accommodate unique short_urls for each successive insert of the same long_url?

    • @jamiepearcey9335
      @jamiepearcey9335 3 years ago

      This design already allows us to associate the same long URL with one or more short URLs.
      To prevent the same long URL from being associated with multiple short URLs, the service would first have to check whether the long URL is already in use and, if so, return the existing short URL, which isn't something that has been described in the requirements or implementation detail.

  • @ashwinnatty
    @ashwinnatty 4 years ago +5

    Hi. Good video. I have a question though. If we Base62-encode smaller numbers like 0, 1, 2 ... we won't be getting a 7-char tiny URL, right? So how do we make sure all our short URLs are 7 chars long? Kindly clarify.

    • @p516entropy
      @p516entropy 4 years ago

      base62("000007") -> "F2nt9Syt"; base62("000023") -> "F2nt9T75"

    • @ashwinnatty
      @ashwinnatty 4 years ago

      @p516entropy: Hi. Thanks for responding. But now consider a bigger number, say 1000000: its Base62 -> 11PVWGSpX6, which is longer than 7 characters. I am still unable to understand how to fix the short URL to exactly 7 characters using the counter method.

    • @p516entropy
      @p516entropy 4 years ago +1

      @ashwinnatty Oh sorry, I was in a hurry, and now I understand. The Base62 algorithm takes the same approach as base2 (000, 001, 010, 011, 100, 101) or base16 (0000... FFFF), only in base 62, with every value padded to 7 digits:
      0 → 0000000
      1 → 0000001
      2 → 0000002
      3 → 0000003
      4 → 0000004
      59 → 000000X
      60 → 000000Y
      61 → 000000Z
      62 → 0000010
      63 → 0000011
      64 → 0000012
      65 → 0000013
      56_800_235_583 → 0ZZZZZZ
      56_800_235_584 → 1000000
      3_521_614_606_207 → ZZZZZZZ
      And indeed, as you can see, there are no collisions.

    • @journeytoraceday
      @journeytoraceday 3 years ago +2

      @p516entropy Sorry, I still don't understand this. base62(101020)=FMA4abRo and base62(101021)=FMA4abRp. If we take the first 7 chars from each, we get a collision, "FMA4abR".

    • @gurumack
      @gurumack 3 years ago

      @journeytoraceday I have the same question...
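
The disagreement in this thread seems to come from what is being encoded. Treating the counter as an integer and writing it in base 62 with zero padding gives exactly 7 characters for every value below 62^7, with nothing to truncate. The 8- to 10-character outputs quoted above are likely what an online tool returns when the digits are encoded as text bytes instead. A sketch, assuming the alphabet order `0-9a-zA-Z` that matches the table in the thread:

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base62_fixed(n: int, width: int = 7) -> str:
    """Encode an integer as exactly `width` Base62 digits, zero-padded."""
    if not 0 <= n < 62 ** width:
        raise ValueError("value does not fit in the requested width")
    digits = []
    for _ in range(width):
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


if __name__ == "__main__":
    for value in (0, 61, 62, 1_000_000, 101_020, 101_021, 62 ** 7 - 1):
        print(value, base62_fixed(value))
```

Since this is a positional number system, two different integers can never map to the same 7-character string, so consecutive counters such as 101020 and 101021 differ in their last character.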

  • @benjaminwestphal9685
    @benjaminwestphal9685 1 year ago

    How does your hashing work o.O - you take a 128-bit MD5 and then make a Base62 string of 20+ chars out of it. How are you sure there are no collisions when taking only a subset of the Base62 string, aka the MD5 hash (just displayed differently)? o.O

  • @nicholasmanning4307
    @nicholasmanning4307 4 years ago +2

    NoSQL is in no way related to a database not excelling at joins. Graph databases excel at joins and relationships, and they are categorized as NoSQL.

  • @lucianomonterovidela
    @lucianomonterovidela 3 years ago

    One question: if we take the first 7 characters, even with different numbers we can have a collision. What did I not understand?

    • @gsb22
      @gsb22 2 years ago

      Any number below 3.5 trillion always results in a unique Base62 string, but the issue is that it would be guessable: if 1 million -> xyz001, then 1 million and 1 would be xyz002. To tackle that, we could add random characters to the number, but that yields a Base62 string longer than 7 characters, which is a problem.

  • @brownbearnishant
    @brownbearnishant 3 months ago

    do more videos brother

  • @cbest3678
    @cbest3678 3 years ago +1

    Can someone explain why zookeeper itself is not single point of failure in this case?

    • @jamiepearcey9335
      @jamiepearcey9335 3 years ago +2

      ZooKeeper provides increased availability through redundancy by operating in a multi-node cluster configuration. Multiple ZooKeeper nodes would have to go down for service interruption. As the URL shortening service reserves a range of values rather than a single value, the number of round trips to this counter reservation system is also reduced, which in turn helps to reduce system load, network traffic, and the number of services involved in handling each reservation. These all help to increase availability, and can also help to increase performance and reduce running costs. You can increase availability further by spanning your nodes and services across separate datacenters, to avoid your infrastructure becoming a single point of failure.

  • @niteshupadhyaya007
    @niteshupadhyaya007 1 year ago

    The counter example is incorrect - Another solution could be to append the user id (which should be unique) to the input URL. However, if the user has not signed in, we would have to ask the user to choose a uniqueness key. Even after this, if we have a conflict, we have to keep generating a key until we get a unique one.

    • @spacepacehut3265
      @spacepacehut3265 7 months ago

      It isn't incorrect, because the counter itself will be unique every time even if the random bits are the same, so the generated hash will be unique for every URL.

  • @shamim520778774
    @shamim520778774 4 years ago +1

    what else can we use instead of zookeeper ?

    • @abhinavprakash3324
      @abhinavprakash3324 3 years ago +1

      etcd is a similar strongly consistent, highly available, key-value store.

  • @sengo669
    @sengo669 3 years ago

    Base62(100000000000) plus another 10- to 15-bit number would be super long. Did you think about that?

    • @sengo669
      @sengo669 3 years ago

      So our goal is to generate a short URL not to make a longer URL.

    • @jamiepearcey9335
      @jamiepearcey9335 3 years ago

      Base62 should be considerably shorter than base 10, and the value you describe comes out to 7 characters. Base64 is marginally shorter still, but the extra two characters might not be suitable for URLs! You may be attempting to convert a string of digits to Base62, which could create a longer Base62 value, as you would be starting off in a higher base unit.

  • @kumarmanish9046
    @kumarmanish9046 2 years ago

    What about database? He never talked about how to store in db

  • @mayurkoli4145
    @mayurkoli4145 4 years ago +1

    The lecture is great, but I have a doubt: MD5 generates a unique hash each time, so we can grab the first 7 chars of that hash. Why do we need to convert it to Base62?

    • @RahulJangra9
      @RahulJangra9 3 years ago

      Please let us know if you get the answer to this ques

    • @RahulJangra9
      @RahulJangra9 3 years ago +3

      OK, got it. MD5 generates a 128-bit output, which is generally represented in hexadecimal. If you take the first 7 chars of this hexadecimal string, you will not be able to make many combinations: only 16^7. However, if you use a Base64-encoded string, you will be able to have many more combinations in 7 chars, i.e. 64^7.

    • @mayurkoli4145
      @mayurkoli4145 3 years ago

      @RahulJangra9 Got it man, thank you ✌️✌️☺️

    • @gsb22
      @gsb22 2 years ago +1

      beware this WILL result in collision if two users use the same url.

  • @kakakukukakakuku
    @kakakukukakakuku 3 years ago +1

    A good candidate would be the person who has watched this video :-)

  • @chirut4327
    @chirut4327 8 months ago

    Base62 encoding can result in a string of any length, and we are not supposed to take the first 7 chars, to avoid collisions. So that means we take whatever output we get from Base62.
    This is what I want to hear from these videos, but believe me, none of them emphasize this. They just say blah blah and use Base62.

    • @boyangzheng269
      @boyangzheng269 3 months ago

      We are not encoding the original URL because, as you said, it would result in a string of any length. Instead we are encoding the numbers (0 - 3.5 trillion), which would not exceed 7 chars because 62^7 > 3.5 trillion.

  • @dhruvdnar
    @dhruvdnar 2 years ago +1

    You rushed through some important aspects. E.g., why would Base62 give 21-22 chars? It's because the encoding packs roughly 6 bits into each character (exactly 6 for Base64, about 5.95 for Base62), and 128/6 ≈ 21.3.

  • @hamzasanwar
    @hamzasanwar 1 year ago +1

    Why base 62 not base 64

    • @abjbreal
      @abjbreal 1 month ago

      52 letters and 10 digits = 62 total characters

  • @bostonlights2749
    @bostonlights2749 3 years ago

    I like your accent

  • @stiffyBlicky
    @stiffyBlicky 2 years ago

    This video is inaccurate base62encode(0) is not aAbB123

  • @NK-ju6ns
    @NK-ju6ns 3 years ago

    aka :)

  • @chriszeng5406
    @chriszeng5406 4 years ago +1

    Listening to you talk gives me the biggest anxiety, but good video still.

  • @mtung05
    @mtung05 3 years ago +8

    This is a straight rip off from Tushar Roy tinyUrl video

    • @elfchosen1477
      @elfchosen1477 3 years ago +1

      I don't think so. This one is much better explained than Tushar Roy's one.

    • @gsb22
      @gsb22 2 years ago

      somewhat.

  • @9939364566
    @9939364566 6 months ago

    Amazing way of explanation.... I watched many YT videos and they all confused me. You nailed it ✌️ Tysm