Tiny URL - System Design Interview Question (URL shortener)

Поделиться
HTML-код
  • Опубликовано: 27 ноя 2024

Комментарии • 154

  • @TechPrepYT
    @TechPrepYT  12 часов назад

    🔍 Full Write Up + Bonus Section (What Top Tech Interviewers Really Want to See) → www.techprep.app/system-design 🎯

  • @mohisham
    @mohisham 9 месяцев назад +165

    Nice video. One correction: when you're showing the number of possible URL's possible (at around 2:45 mark), you have shown it as 1^62, 2^62 .. etc. when it should be 62^1, 62^2 ... 62^7

    • @TechPrepYT
      @TechPrepYT  9 месяцев назад +15

      Great spot thanks for that!

    • @reidchatham
      @reidchatham 9 месяцев назад +11

      Lol yeah, 1^62 = 1

    • @EmmanuelOdii80
      @EmmanuelOdii80 9 месяцев назад +1

      Yeah, that's very correct.
      So, it's now like 62⁷! , so that we get the total number of ways it can be written.
      Still some insane number.

    • @saviorodrigues4248
      @saviorodrigues4248 5 месяцев назад +3

      yeah this just confused me lol

  • @TechPrepYT
    @TechPrepYT  9 месяцев назад +184

    Correction: Number of possible URLs (2:45 mark) should be:
    - 62^1
    - 62^2
    ...
    - 62^6
    - 62^7
    SO to @mohisham for spotting!

    • @GeorgeDumitra
      @GeorgeDumitra 9 месяцев назад

      the way to calculate this, in my eyes, work like this. I will take a simple example: 3 characters, a,b,c (instead of 62), each URL will have 2 characters.
      We have (ab) (ac) (ba) (bc) (ca) (cb). This is basically Arrangements of 3 taken by 2 => A(3,2) = 3!/(3-2)! = 6
      If we accpept also combinations like (aa) (bb) (cc) then we will add also those 3 unique possibilities meaning we have 9 possible arrangements, thus we have the formula:
      Total URL number=> A(n,k) + n
      Where in your case n=62 and k=1...7.

    • @querela92
      @querela92 8 месяцев назад +4

      ?
      Isn't it simpler with 62^n meaning 62 * 62 * ... * 62, with n times 62. As an example, for n letters I have 62 choices, for two letter 62 * 62, so 62 choices for the first letter and for each letter 62 choices for the next letter, so we multiply which is in short 62^2. And that can be extended for any n.

    • @GeorgeDumitra
      @GeorgeDumitra 8 месяцев назад

      @@querela92 "Isn't it simpler...". Hmm..you want simpler or precise?
      "And that can be extended for any n"...actually it can`t.
      Let`s take a larger number and let`s see where your mistake is:
      We have 62 elements that will be taken by 7.
      Your calculus will be: 62^7 = 3.521.614.606.208
      But in reality it will be 62 + A(62,7) = 2.478.652.606.142
      This means you will end up paying for a space, transactions, cache, etc on a server larger that your really need with: 1.042.962.000.066.
      This is about 1.04 TB of data you will never need.
      This is the difference between mathematics and "let`s make it simple".

    • @sid007ashish
      @sid007ashish 8 месяцев назад

      Exactly what I was thinking, thanks for the correction.

  • @felipe2637
    @felipe2637 8 месяцев назад +5

    i am not at a point in my career where i need to be an expert in system design, but by following such high quality content, i believe I'll definitely be an expert when the time comes. Amazing video, and thank you for providing this content for free!

  • @georgeadamou7020
    @georgeadamou7020 9 месяцев назад +6

    I believe the specific problem is a perfect match for the Actor model. The way that I would solve this is going via one of the popular actor frameworks (akka or orleans). Both of these frameworks provide out of the box the functionality that you get from the load balancer, zookepper and distributed cache. All you need is you actor cluster and a persistent storage. The persistent storage can be a database (SQL or NoSQL) or a cloud storage account (e.g. aws bucket, azure blob etc.). i believe this approach will reduce the operational complexity and cost significantly while achieving similar results in terms of scalability and fault tolerance.

  • @KnightSwordAG
    @KnightSwordAG 9 месяцев назад +42

    I really like the idea of a zookeeper to keep track of the short url identifiers. I have never thought of it. And now it's mine. :D

    • @anuragbisht1200
      @anuragbisht1200 6 месяцев назад

      etcd can be another one, simpler than zookeeper.

  • @superhoops540
    @superhoops540 9 месяцев назад +55

    it seems a bit overcomplicated to have to reach out to some service to get some count to generate a URL. There's a 1 in 3 trillion chance that some random shortened url would collide. I think it's reasonable to just generate the url at random and try to save it. if it exists in the db just try again. The user can wait that extra amount of time that the server needs to re-generate the url in the tiny chance that case arises

    • @DanielSimon1995
      @DanielSimon1995 9 месяцев назад +28

      That, or pregenerate x urls in a shared queue and just pop one when needed..

    • @awepicness
      @awepicness 8 месяцев назад +11

      that chance increases the more urls are stored. After only a year you'll have 31B urls stored. If you continue to generate a URL at random, you'll then have a 1/100 chance of a collision.

    • @kyle8575
      @kyle8575 8 месяцев назад +2

      @@awepicness This is also assuming urls don't expire.

    • @jporritt
      @jporritt 8 месяцев назад

      I suggest you look into the Birthday Paradox

    • @JochemKuijpers
      @JochemKuijpers 8 месяцев назад +8

      Just adding a single character in length drops that probability to 1/620, random URLs definitely seem to be the way to go here, simplifying the design. That's a 4E-9 probability that you won't be able to insert a URL within three attempts, basically negligible. You can opt to fail the request after 3 failed insertion attempts and it wouldn't measurably impact your service's reliability.

  • @albanm8430
    @albanm8430 9 месяцев назад +9

    Nice, I wasn't aware about the zookeeper component. Thanks for that! keep up with this good content!

  • @0MrJay0
    @0MrJay0 6 месяцев назад

    This is the best TinyURL system design video. It's so clear and straight forward. keep up with this good content!!!

  • @MichaelPetito
    @MichaelPetito 9 месяцев назад +9

    Zookeeper seems redundant given a database sequence would have accomplished the same and would be a miniscule load compared to the R/W activity for the urls themselves.
    This explanation also glosses over the non-functional requirements for HA and low latency, since the concern for geographically distributed and redundant deployments was not addressed.

  • @BarakaAndrew
    @BarakaAndrew 9 месяцев назад +39

    instead of zookeeper why not simply have the distributed servers to pick the starting letter, e.g. server 1 can pick tiny urls beginning with 0-9, server 2 urls beginning with a-f, server 3 url g-m e.t.c that way u don't have to deal with having another point of failure, if one of those servers die it's fine you'll simply not produce a certain range until that server comes back up.

    • @mariusheller8231
      @mariusheller8231 9 месяцев назад +8

      What happens if you want to add additional servers at a later point?

    • @NubeBuster
      @NubeBuster 9 месяцев назад

      ​@@mariusheller8231do the first two or more. You can update the existing servers to then also enforce the longer prefix

    • @marcialabrahantes3369
      @marcialabrahantes3369 9 месяцев назад

      only scale those servers?​@@mariusheller8231

    • @vikingthedude
      @vikingthedude 9 месяцев назад +4

      I like this, but wouldn’t this mean the state of “last generated url” is stored in-memory in the application servers? What happens when the servers are redeployed? That state would be wiped out and there’s no way for each server to resume from where it left off

    • @michawhite7613
      @michawhite7613 9 месяцев назад

      @@mariusheller8231Presumably you'd have a way to change the ranges for each server later.

  • @bigodines
    @bigodines Месяц назад

    2:11 Don't overlook these estimates folks. this is likely the main reason you are being asked this question. There are lots of traps if you just pick a random hashing function and cut parts of it at that volume. The rest of the question is prett straightforward

  • @configadd
    @configadd 8 месяцев назад +2

    Besides of service failure, there are regular service updates during development of new features, which will kill all instances of replicated service and invalidate potentially large amount of address ranges. Personally, I prefer using hashing algorithms for such tasks, dealing with collisions on writes, since lower latencies typically needed only during reads and reads are much more frequent. For example, you can calculate hash of incoming long URL and cut it to required length, which is secure, pretty robust and prevents duplicates in database.

    • @degosoft
      @degosoft 8 месяцев назад

      Once you cut up the hash to the required length there is no longer any guarantee that it's unique. We are talking about a 1000 new URL's per second so "pretty robust" isn't going to cut it.
      Dropping that many keys seems like a strange solution, but even if you reboot a 1000 servers every day for a year, you still have more then enough keys left for years to come. But if it could be a problem, you could also ask for a range of 1 to 100.000 and lose a lot less. Also keep in mind that you don't actually lose the entire range, because some have already been used. You could also schedule a server to go down instead of just dropping it. That it will reboot once all keys are release. But still, I don't think it's really an issue to lose those keys.
      I do agree with the latency problem mostly being a factor on reads. That is when a user manually creates a new URL and has to wait a second after clicking post. But for generating links via an API the latency is important both on insert and read.

    • @configadd
      @configadd 8 месяцев назад

      @@degosoft With hash there is no uniqueness guarantee anyway, this is only about collision chance and key distribution. That's why I mentioned collision management. So, by "pretty robust" I mean "robust enough for the most cases". Of course, there is no silver bullet, all solutions have pros and cons. I'm just talking the way I prefer to solve the problem in real-world applications. And there is one more thing about using tiny URLs in real-world applications: they often not intended to live forever. In some cases, shorter URL with lifetime, for example, one year, may better suit your needs. Since the information at the source URL will no longer be valuable at that point. In this case, reclaiming space with hashed URLs will be a bit easier, just the matter of removing old entries from database by creation date. Another advantage to this approach is ability to set any tinyURL length at any time, even by request parameter. And, of course, absence of failproof zookeeper cluster reduces structural complexity of the system.

    • @degosoft
      @degosoft 8 месяцев назад

      @@configaddNothing is guaranteed, but the chances for a sha256 collision are so incredibly small that it's practically none existent. And the chances get even smaller when we only use URL's as the input. But if you discard 25 of the 32 bytes in the output to get to a 7 byte shortcode, then you increase the odds by a lot. And bytes aren't even useful, so we need to encode the output bytes to something readable first which makes the default length even longer and thus you drop even more precision.
      Don't get me wrong, hashing is great and I've used it a lot for similar projects. But if "SHORT" is te requirement then hashing is not the way to go.

  • @kennetemerson
    @kennetemerson 8 месяцев назад

    thank you for the content, I remember going through that problem in a interview when I was a junior.

  • @saxybrian
    @saxybrian 9 месяцев назад

    Interesting enough I got asked this a few weeks ago in an interview.

  • @prashantsalgaocar
    @prashantsalgaocar 5 месяцев назад

    Thanks much for the design. Easy to understand and not much complexity in solving what the end user really needs..

    • @TechPrepYT
      @TechPrepYT  5 месяцев назад

      Thanks!! That's exactly what the aim is 👍

  • @radutudorica8835
    @radutudorica8835 8 месяцев назад +1

    if browser redirection is the only requirement, why not just keep all shortened urls as files on a CDN and expose that (you could have just one js line that tells the browser to change location and if you want analytics you could just add another few lines to make a request to the analytics service), by picking a wide enough range you don't really need to worry about uniqueness and you can pick the short urls at random and not check if it's already in use

  • @mhadi-dev
    @mhadi-dev 6 месяцев назад +2

    Two suggestions:
    1. Instead of Counter or Zookeeper, you can use a hash function that takes the URL and produce a 7 chars string. Thus don't need to solve the problem of counters and Zookeepers and it's more secure.
    2. An additional point from the interviewer might be to Add the ability to use custom short URLs. This is really a good point to discuss

    • @TechPrepYT
      @TechPrepYT  5 месяцев назад +2

      Great suggestions!

    • @sixoclock406
      @sixoclock406 5 месяцев назад +2

      How would you handle collisions?

    • @mhadi-dev
      @mhadi-dev 5 месяцев назад

      @@sixoclock406 For MD5, the probability of just two hashes accidentally colliding is approximately 1.47*10^-29

    • @tejinder83
      @tejinder83 4 месяца назад

      That would not work due to collisions.

    • @mhadi-dev
      @mhadi-dev 4 месяца назад

      @@tejinder83 What's the probability of collisions of a good hashing function? (MD5 for example)

  • @Feedmagoo92
    @Feedmagoo92 9 месяцев назад +11

    Would it make sense at all to derive the shortened URL from the long url (Hashing and salting etc.), it may increase the chances of collision with such short URLs, but it would resolve the countcache issue, because if we use those numbers as salts, it doesn't really matter if the count cache returns the same value twice.
    In fact, if we can get away with a pure hash, you can reduce duplicates in the database, because multiple people shortening the same site always get the same result.

    •  8 месяцев назад

      A salt could help with resistance, but cannot prevent collisions at a reasonable level. So you could never fulfill a request to create a new URL with an accidental collision.

    • @sakonthephamongkhol5767
      @sakonthephamongkhol5767 3 месяца назад

      then how about appending random string to the url before hashing and if it collides then just redo

  • @1fabi0
    @1fabi0 8 месяцев назад +1

    Instead of Zookeeper why not use a hashing algorithm to hash the long url down to the short url, this would not only make the shorturl (nearly) unique, it would also always give the same shortened url for the same long url which avoids duplicate entries in the database

    • @minnow1337
      @minnow1337 8 месяцев назад

      The only way to prevent collisions with the hashing method is to make the url longer which is counter-productive to the goals of the service.

  • @jok5211
    @jok5211 8 месяцев назад +1

    Hashids would def. be a good idea to use :)

  • @michawhite7613
    @michawhite7613 9 месяцев назад +1

    This data doesn't change, so the caching is a lot simpler. In fact, I would forego Redis altogether and just hold a cache on the web servers themselves. That's less computers you need to pay for and fewer points of failure.

    • @drewbrown1534
      @drewbrown1534 9 месяцев назад +4

      This doesn’t work well when scaling very wide. If we had 100 webservers, we’d be caching the data up to 100 times. A centralized cache is a much more effective solution

    • @gustavoalves4324
      @gustavoalves4324 9 месяцев назад +2

      @@drewbrown1534 That makes sense, i think it would also be better to handle the LRU strategy that we may adopt to keep just the most used URLs in cache.

  • @jessegador
    @jessegador 8 месяцев назад

    I think you got the correct answer for getting the number of unique combinations given the number of characters but the formula is wrong. 1^ 62 and 2^62 for example are incorrect formula but the answer is correct.

  • @kenchooooo
    @kenchooooo 9 месяцев назад +1

    Excellent video! Could another single point of failure be each webserver? For example, during a GET request, if the webserver handling the range is down, the client won't be able to get the long url.
    My suggestion is to use a cluster of webservers covering each range. Each cluster would manage itself with a consensus algorithm like Raft or Paxos. That would make each webserver more available while maintaining consistency and resistance to network partitions. This would require more webservers, but it could be argued that since we're using only 7 or so webservers now, having 3-4 webservers in each cluster won't make too much of a difference in cost. We could also increase the range of each cluster, but then we might be sacrificing throughput-it's just a tradeoff between the cost and throughput now.

    • @TechPrepYT
      @TechPrepYT  9 месяцев назад +5

      Great comment!
      A single point of failure (SPOF) is one that causes the entire system to go down. So while you’re correct that if a server goes down while responding to a request, that request will fail. However, it is not a SPOF as the system will still be up, as if the user were to send another request it would simply be routed to a different server.
      Having clusters of servers could be a great extension to the system. However, it does introduce more complexity that could be argued is over engineering the solution without adding a significant improvement to the overall system. But as you already mentioned it’s all about tradeoffs!

    • @JensRoland
      @JensRoland 9 месяцев назад +1

      That just sounds like DynamoDB with extra steps.

    • @Vugoseq
      @Vugoseq 8 месяцев назад

      I think you made the assumption that web servers are limited to their own range for responding to GET short-url requests, but that isn't said anywhere. The zookeeper gives them a range of free IDs to use for shortening urls, but each web server can accept any existing short url and will contact a cache and if not in there, the database to get the long url.
      So, if a web server dies, retrying the request would just be delegated to another web server that will return the correct long url. Even for shortening, if a web server dies and you retry the POST, you'll get a new shortened url from another web server (with an ID from another range)

  • @MLGJuggernautgaming
    @MLGJuggernautgaming 8 месяцев назад +1

    So the zookeeper only serves to assign a range to the server? How does the server track which values in the range are used, by counting?

  • @ure_02
    @ure_02 8 месяцев назад

    Really nice video, will definitely watch more of your videos, keep it up!

  • @row-v8o
    @row-v8o Месяц назад

    what is the best type of database to use for this? relational or non-relational?

  • @kylewollman2239
    @kylewollman2239 9 месяцев назад

    Great explanation. I'm looking forward to more content from your channel!

    • @TechPrepYT
      @TechPrepYT  8 месяцев назад

      Appreciate the support! Could I use this comment on the website?

    • @kylewollman2239
      @kylewollman2239 8 месяцев назад

      @@TechPrepYTYes.

  • @stylishskater92
    @stylishskater92 9 месяцев назад +4

    I dont get the count cache, whats the purpose of it?

    • @DatFarrix
      @DatFarrix 9 месяцев назад +4

      It's not discussed imo really clearly in the video. Here was my guess: If two incoming POST requests come in at the same time, and there are multiple web servers handling requests, it is possible for both to generate the same key and cause a collision. At a TPS of 1000, there is a high likelyhood of collision. Thus, the count cache and the zoo manager are ways to avoid this scenario of collisions between different webservers.

  • @aragonnetje
    @aragonnetje 8 месяцев назад +1

    suggestion for alternative to zookeeper: you could have the create-url endpoint generate a random short URL to match to the long URL, to remove any count cache or zookeeper, so long as the space is big enough there should be no collisions

  • @paulostradioti
    @paulostradioti 3 месяца назад

    Could you please clarify how would the convertion from a base 10 number to a base62 length 7 string be?

  • @luuuizpaulo
    @luuuizpaulo 8 месяцев назад

    Nice video! It would be nice if there were also basic videos explaining, for example, what a zookeeper is. Clicked on subscribe!

  • @germangamboa3421
    @germangamboa3421 8 месяцев назад +1

    I would say for 1k writes per second this is kinda of over kill.
    A single RDS instance could handle this thing without breaking a sweat instead of needing things like a zookeeper.

  • @samgoodwin89
    @samgoodwin89 8 месяцев назад +2

    If you implemented this with zookeeper I would recommend no hire. Way over engineered. Hash the url, salt on collision, store in dynamo and cache with redis.

  • @philipkopylov3058
    @philipkopylov3058 9 месяцев назад +1

    Thanks for the video. Would that be a valid assumption that you don't really need to create short URL during the request? Aka you could run your 7T numbers ahead of user requests and create short URLs and store them waiting for long URL to come and just plop that in? That would solve few points immediately: 1. Faster processing 2. No need to keep the zoo of web servers and maintaining them. 3. You could mix them non-sequentially so that would alleviate the attack issue
    Thanks, just learning here! Curious what you think
    I guess the only point of failure is that your speed of making those short URLs should be faster than your hypothetical speed of using those URLs, though I'd expect that there would be a very big head start initially. I guess this is where you would need to add another web servers

    • @TechPrepYT
      @TechPrepYT  9 месяцев назад +2

      Thanks for the comment!
      It’s an interesting proposition alright. In terms of generating the short URL, that takes a negligible amount of time and so precomputing that wouldn’t have any noticeable impact on performance.
      Not sure I fully understand the second part of the comment. Are you suggesting storing all the short URLs on a single server?
      By having all the short URLs on one server while convenient introduces a single point of failure and so thats why horizontal scaling is recommended. And then, once there is more than one server involved, a service for maintaining distributed synchronisation is typically recommended.
      It’s all about trade offs but typically a single point of failure is something you always want to avoid.

  • @chessmaster856
    @chessmaster856 13 дней назад

    What is the point of useless counting? Are you counting with fingers? Things are supposed to be extensible right?

  • @Noyemie26
    @Noyemie26 5 месяцев назад

    Hi! In the first naive solution, why are you using a load balancer if we only have one web server?

  • @Thinkingfeed
    @Thinkingfeed 9 месяцев назад

    Nice, keep posting please!

  • @MahadiHasan-qb9zw
    @MahadiHasan-qb9zw 8 месяцев назад

    It will be great if we get a tutorial on hand on coding experience tutorial based on this system design . is there any chance will you make ?

    • @TechPrepYT
      @TechPrepYT  8 месяцев назад

      It's on the roadmap!

  • @thisismissem
    @thisismissem 8 месяцев назад +18

    For what it's worth, as a Staff+ software engineer, if you submitted to me a "URL Shortener" task as depending on Zookeeper, you're not be getting the job, since you're Introducing a huge amount of operational complexity which can be avoided by changing the algorithm you use for generating URLs (e.g., you could use a single bit for a server ID value - this is what Snowflake IDs do)

  • @paulostradioti
    @paulostradioti 3 месяца назад

    What about checking if the long URL already exists during a POST request?

  • @agustikj
    @agustikj 8 месяцев назад

    What about hashing the input url and use that to create the short url?
    You could take the md5 sum of content and create a custom encoding for ti using the 64 chars
    md5 is 45 characters in hex and 25 chars in base64. The benefit of using a hash function would be to elimnate the zookeeper

    • @SimDoes
      @SimDoes 8 месяцев назад

      the whole point is to have a short url, 25 chars would be way too long

  • @jameshogge
    @jameshogge 9 месяцев назад +2

    A random suffix is a poor solution to the security aspect. You can have unpredictable URLs with a minor compute overhead simply by encrypting the range-based number to get the actual URL. This will never introduce a collision. You only concern should be publicising the key you use for encryption.

  • @honestduane
    @honestduane 8 месяцев назад +2

    This is a bad design.
    Your teaching people to create bloated systems by adding zookeeper when its not needed.
    And some math is wrong.
    This system is much more bloated and more expensive to maintain than it should be because of the bad design.

  • @fritzfahrmann4730
    @fritzfahrmann4730 8 месяцев назад

    great video, great content :) subscribed
    (but pls fix clicking sounds of laptop hearing on onboard mic)

    • @TechPrepYT
      @TechPrepYT  7 месяцев назад +1

      It's fixed! Thanks for the recommendation!

  • @givrally
    @givrally 8 месяцев назад

    My idea for generating URLs would be to hash the URLs first, convert the hexadecimal output to base 62, and then crop to however many characters you'd want in a URL. Then, servers can check if that short link already exists in the database. If it exists under a different long url (=collision), then add a special character and re-do the hash, remembering to strip that special character later when entering it in the database. Is there any huge flaw in this method ? I'm a data scientist, so this is absolutely not my area of expertise.

    • @givrally
      @givrally 8 месяцев назад

      One advantage that I *can* see over the method in the video, if I've understood it correctly, is that my method is more secure, it's pretty much impossible for someone to move characters around and find other URLs. Also, not that it matters much, but it also avoids wasting database space. I'm sure as a public service tons of people would be shortening links like google or the rickroll, meaning potentially thousands or even millions of short links would be wasted in our database when a single one would suffice, potentially slowing down performance.

    • @Squirvel
      @Squirvel 8 месяцев назад

      I think this solution is assuming a random distribution or close to random distribution for inputs. This probably doesn't hold true

    • @givrally
      @givrally 8 месяцев назад

      @@Squirvel That's the beauty of hashing functions, it doesn't have to assume anything about the distribution. If you change a single character, the whole hash is different.

  • @spes9850401
    @spes9850401 8 месяцев назад

    I have a question:
    Instead of using 7 characters for 10 years, why don't add one more character to make it work for 6 centuries? Is it too computationally expensive?

  • @Vajradhar
    @Vajradhar 9 месяцев назад

    Great . Well I have a doubt here. Most of them enforce authentication for the long url redirects so it won’t have any attacks from unauthorised users?

  • @tmczq8303
    @tmczq8303 8 месяцев назад

    How about removing the DB step completely? We'd probably expect the URL entry to be gone after configured TTL, high enough to let the user forget about the URL. This makes me wonder the db step is not needed, possible it would be better to ask another question -- How long should the URL entry live? :D

    • @TheRPGminer
      @TheRPGminer 8 месяцев назад

      Imagine going through chat history and seeing that link had expired 😢. Or embedded link stops working.

  • @fritzfahrmann4730
    @fritzfahrmann4730 8 месяцев назад

    why not shard the counters? each is doing a set of ranges? and if some are offline u just loose some ranges.
    the counters just could be balances by round robin dns

  • @mohammadreza.beygifard
    @mohammadreza.beygifard 9 месяцев назад

    Thanks a lot!

  • @thoriqadillah7780
    @thoriqadillah7780 8 месяцев назад

    hi, im junior and interested to learn these kind of things. where do you think i should start?

    • @TechPrepYT
      @TechPrepYT  7 месяцев назад

      Depends on what you want to learn!

  • @SarKurd
    @SarKurd 9 месяцев назад

    Ok I like baldy, great content

  • @bopon4090
    @bopon4090 8 месяцев назад

    Why not ise timestamp with server id for url generation?? My first though is to use time stamp + server id hash. Can someone explain this to me

  • @Simorenarium
    @Simorenarium 9 месяцев назад +1

    I'd separate the Website (URL Creation) and the routing. This would decouple the performance requirements for each type of server.

  • @longshin4299
    @longshin4299 9 месяцев назад

    Great! Can you share the system design PDF book to everyone

    • @TechPrepYT
      @TechPrepYT  8 месяцев назад

      Yes, the text version (with the slides) of the tutorials will be available on the website soon!

  • @ArchonLicht
    @ArchonLicht 8 месяцев назад

    Right off the bat - why not use BASE64 URL variant? Why only 62 chars? BASE64 is already there. Hash, encode as BASE64 - done. Without inventing new custom "standards".

    • @blesswind584
      @blesswind584 2 месяца назад

      You dont want special characters in your shortened URL right? So base64 wouldn't work. Remove the special characters in base64 and you now have base62

    • @ArchonLicht
      @ArchonLicht 2 месяца назад

      @@blesswind584 There is a URL variant of BASE64

  • @tusharjain2604
    @tusharjain2604 8 месяцев назад

    You forget to check if the created tiny url is unique or not

  • @HeellFighter
    @HeellFighter 9 месяцев назад

    It isn't better to use a PUT instead of the POST for idempotency?

    • @LLlAMnYP
      @LLlAMnYP 8 месяцев назад

      Creating a shortened URL isn't necessarily idempotent.

  • @AjayCoding
    @AjayCoding 9 месяцев назад

    great

  • @daveh0
    @daveh0 9 месяцев назад +7

    You've failed the interview. It's not a "super simple REST API". It's just a HTTP API - the "create" in your URL is a big hint that you're not following REST design principles. There's nothing wrong with just doing a super simple HTTP API.
    You should consider tech interviews to be like an exam and be precise in your usage of terminology. What is REST could easily be a warm up question. It's the sort of thing many people don't know!

    • @Ziimzy
      @Ziimzy 9 месяцев назад +1

      I’m curious, I can’t find anything online that states that you can’t have “create” in a REST API. Have you got any good sources on REST APIs that I can read?

    • @daveh0
      @daveh0 9 месяцев назад

      @@Ziimzy Fielding's original paper doesnt mention everything that is not REST but it does tell you how. URLs are supposed to identify resources and you're meant to manipulate those resources through the verbs already provided. Beyond that, Googling for "verbs in url not rest" gives tonnes of info

    • @alexiso2740
      @alexiso2740 8 месяцев назад

      @@Ziimzy You don't use the keyword create because by definition, if it is a POST endpoint, the fact that it is a POST type request then it is a creation.

    • @Ziimzy
      @Ziimzy 8 месяцев назад

      @@alexiso2740 Ah I see, that makes sense

  • @rigveddesai5843
    @rigveddesai5843 9 месяцев назад +1

    how does adding a random suffix protect the URLs from predicting URLs? can you explain with an example?

    • @philipkopylov3058
      @philipkopylov3058 9 месяцев назад +5

      Let's say you if you making url number 1,000,000 and convert in into base62 then it would be 4C92, if you are going sequentially I can tell that next number is 1,000,001, which I could predict and try the next link with 4C93. If you add random suffix, it gets harder to do that.

  • @oelsardine9422
    @oelsardine9422 8 месяцев назад

    0:45 seconds into the video and you don't know what a REST API is.
    Sad.

  • @afadeevz
    @afadeevz 9 месяцев назад +1

    Your audio has some low freqs

    • @Kiliman3970
      @Kiliman3970 9 месяцев назад

      I think that’s the sound of his keyboard. He should probably invest in a microphone arm to keep it off the desk.

  • @sho3bum
    @sho3bum 9 месяцев назад

    How is the collision happening, I don't get that part

    • @marcialabrahantes3369
      @marcialabrahantes3369 9 месяцев назад +1

      The count caches are distributed. It's essentially a random number generator...
      No more than 1 server instance can return a duplicate number

  • @EranM
    @EranM 5 месяцев назад

    shkeama?

  • @renauddanniau676
    @renauddanniau676 9 месяцев назад

    6:50 Cassandra is a row oriented database, not column ? Not sure why it has been use as an argument to emphasize the choice.

  • @osama639
    @osama639 8 месяцев назад

    no bro. this is not REST. REST must NOT contain verbs. Dislike from soyjak

  • @Edekeful
    @Edekeful 8 месяцев назад

    What is this math?

  • @12crenshaw
    @12crenshaw 8 месяцев назад

    Szkima XD

  • @edhahaz
    @edhahaz 8 месяцев назад

    I think it's called a map in a C++ program, running o toaster hosted in your mom's basement: 10trillion req a day.

  • @Ca1vema
    @Ca1vema 9 месяцев назад +2

    What are waste of resources. All of that for what, handling, like at most 500 clicks per second?

    • @vyrp
      @vyrp 9 месяцев назад +4

      He mentioned in the beginning: 1000 new URLs created per second, 10000 queries per second.

  • @KardboardCode
    @KardboardCode 7 месяцев назад

    Amazing crisp video!!

  • @greenhacker0com
    @greenhacker0com 8 месяцев назад

    Do i need to become a Network Engineer to be a Web-developer??

    • @alexiso2740
      @alexiso2740 8 месяцев назад

      No, just understand the key concepts of network like OSI stack and troubleshooting.