6:08 Search index - partitioning issues
8:22 Adding items to carts: Real time updates, Version vectors, CRDTs
13:42 Placing an order: Writeback Cache
16:35 Streaming/batch stock updates
17:55 Architecture
Would a cache be a decent option for the management of carts?
A) it’s transient data
B) updates and fetching would be quick
C) Cost is not going to be a huge problem, considering that we’re storing relatively little data in the cache
Seems very reasonable to me!
I liked just for the intro.
Jordan, your content is super authentic and your style of presenting is very practical without any bullshit. Keep up the great work!!!
Thanks Kishore!
Great content! It's been a productive summer for you Jordan
Love the jokes too
Appreciate it Edmond, thanks for watching! Always helps that no one gets anything done at Google, who knows what would be happening if I were at Amazon haha
I dig your vibe throughout this video bro! Looking forward to more of this.
Thanks man! Stay tuned for 5pm EST today in that case! Glad to have you as part of the community.
Hi Jordan, in the product table (MongoDB) you talked about, are you sharding the product table itself, or are you sharding the search cluster built over this product table? Also, is it doable to shard the cluster indexes in MongoDB?
I think that both would be required, and yes.
The version-based update technique is called optimistic locking.
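For anyone who wants to see the idea in code, here is a minimal sketch of optimistic locking, assuming a hypothetical `carts` table with a `version` column (the table, column, and function names are made up for illustration and are not from the video). The write only succeeds if the version the writer originally read is still the current one.

```python
import sqlite3

def update_cart(conn, cart_id, new_items, expected_version):
    """Write new_items only if nobody else bumped the version in the meantime."""
    cur = conn.execute(
        "UPDATE carts SET items = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_items, cart_id, expected_version),
    )
    conn.commit()
    # rowcount is 0 when the version check failed, i.e. we lost the race.
    return cur.rowcount == 1

# Hypothetical schema, using an in-memory SQLite database as a stand-in.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE carts (id INTEGER PRIMARY KEY, items TEXT, version INTEGER)"
)
conn.execute("INSERT INTO carts VALUES (1, 'apple', 0)")
conn.commit()

# Two writers both read version 0; only the first write is accepted.
first = update_cart(conn, 1, "apple,pear", 0)   # True
stale = update_cart(conn, 1, "apple,kiwi", 0)   # False, caller must re-read and retry
```

The losing writer is expected to re-read the row and retry, which is why this tends to work best when conflicts are rare.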
After this intro I neeeeed the AWS system Design Video! Time to sub and wait
Edit: Also am a big fan of the references to other videos
Haha if I do that it'll have about 50 parts
😊
😊😊
When is the AWS System Design video coming ?
We could have a series on AWS where you do an in-depth coverage of various AWS services.
Once I start smoking crack (shouldn't be too long at this rate)
Hi Jordan, when would you prefer CRDT vs version vectors for merging data between nodes?
Well, version vectors don't allow us to do any merging at the database level. They do allow us to detect conflicts, though, so that we can store siblings and do the merging at the user level.
I'd use a CRDT if it can handle merging the data, but if that's too complicated, version vectors and user merging may be better.
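To make the difference concrete, here is a rough sketch, assuming version vectors are plain dicts from node name to counter and that the cart is an add-only set (the function names are hypothetical). `compare` only tells you whether two writes conflict, while the set merge resolves the conflict on its own. Note that an add-only set cannot handle removals; a real cart would need something richer, such as an observed-remove set.

```python
def compare(a, b):
    """Compare two version vectors given as dicts of node -> counter."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"  # conflict: keep both values as siblings
    if a_ahead:
        return "after"       # a supersedes b
    if b_ahead:
        return "before"      # b supersedes a
    return "equal"

def merge_add_only_cart(a, b):
    """Grow-only set CRDT: union is commutative, associative, and idempotent."""
    return a | b

# Version vectors detect the conflict but cannot resolve it.
print(compare({"n1": 2, "n2": 1}, {"n1": 1, "n2": 1}))  # after
print(compare({"n1": 2, "n2": 0}, {"n1": 1, "n2": 1}))  # concurrent

# The CRDT merges the two replicas without asking the user.
print(sorted(merge_add_only_cart({"apple", "pear"}, {"apple", "kiwi"})))
# ['apple', 'kiwi', 'pear']
```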
@@jordanhasnolife5163 thank you
Hey Jordan,
I'm reviewing this design and have a question about the use of a message broker, specifically in the product update/search part. The product service updates the MongoDB table, which then sends change data to a Kafka queue to update the search index.
Why do we use a message queue between MongoDB and the search index, but not between the product service and MongoDB? I'm trying to understand when to use a message queue and how to justify its use.
Again, thanks for the great content. This video is especially clear and I learned a lot.
Hey Yiwei! Thanks!
I think the message queue was used here between the DB and the search index because, afaik, change data in Mongo has to be sent to a queue :)
Could be wrong though, you're right in that it's not inherently necessary in this case
@@jordanhasnolife5163 thanks Jordan, my question is more about why we choose NOT to use a MQ between the product service and the MongoDB?
@@yiweizhang286 I'd say when you can avoid having more components you probably should, it just adds complexity
Hey Jordan, love your channel! This is my first time ever doing system design and your vids help so much. One question -- The way that the diagram has the LB is a little confusing, wouldn't an API gateway direct traffic to the correct service, and then LB placed in front of each service to handle distributing traffic between the replicated server instances? Or can a load balancer alone accomplish traffic directing too?
Yeah I tend to lump them together, but you're correct!
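A toy sketch of the split described in the question, assuming path-prefix routing at the gateway and round-robin balancing per service (the class names, paths, and instance names are all made up for illustration):

```python
from itertools import cycle

class RoundRobinBalancer:
    """Spreads requests across the replicas of ONE service."""
    def __init__(self, instances):
        self._instances = cycle(instances)

    def pick(self):
        return next(self._instances)

class ApiGateway:
    """Chooses WHICH service handles a request, then defers to its balancer."""
    def __init__(self, routes):
        self._routes = routes  # path prefix -> balancer for that service

    def route(self, path):
        for prefix, balancer in self._routes.items():
            if path.startswith(prefix):
                return balancer.pick()
        raise LookupError(f"no service registered for {path}")

gateway = ApiGateway({
    "/cart": RoundRobinBalancer(["cart-1", "cart-2"]),
    "/orders": RoundRobinBalancer(["orders-1"]),
})
```

In practice many load balancers can also route by path, which is part of why the two often get lumped together.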
@@jordanhasnolife5163 thanks for the response and helping me to become the best giga chad I can be 😂 lol
Hey Jordan,
Thanks for all these videos!
I had a question regarding the choice of technologies in the architecture.
At my job, I have used things like MongoDB, Dynamo, MySQL etc., but the applications that I have worked on usually had fewer than 1k users per day, so we didn't really run into many concurrency or scaling issues.
Now that I know about stuff like Riak, Flink etc. it makes sense to use them but in practice I have no idea how to implement them.
In this case, do you think it would be a good idea to even talk about them in the interview?
What if the interviewer asks me to double click on one of these technologies? I probably wouldn't know the answer...
Wondering if it will be a good idea to stick with what I know and mention stuff like "btw, I heard some technologies that support X, I would love to explore them." etc. during the interview.
Also, are you really this funny in real life or do you think about this content before making these vids? lol
I bet you pull like nobody's business because you're pretty funny!
1) I'd probably say: even though I haven't used these as much in my professional career, I think that theoretically for this type of problem we want to use x
2) Still no bitches 😭
Great video Jordan! Keep going! I have some questions tho. Why did you use MySQL to store the orders? How about something like Mongo? Also, if SQL is the choice, what would the schema look like, since we are partitioning the Kafka queue by product ID? Thank you 🙂
Yeah, I think that you could use a DB other than MySQL since we don't really need transactions for that part. As for your second question, I think maybe something just like (orderId, productId, quantity) would work for the schema.
Great content.
I am learning the design and have a question:
Whenever the order service in the diagram (from 17:54 on) receives an order request, does it require another (relational) database for saving the order first, before pushing the same order to the Kafka queue for the inventory function (which includes another set of databases)?
So in the diagram there would be another link from `order service block` to `SQL DB icon`, with `Kafkaqueue icon` side by side. Does that make sense?
thanks
I don't think it requires it, but you certainly could do that. In the end you can give the order an ID before placing it in the Kafka queue so that you can reassemble the order again later from the individual items.
yea, it may be good to assemble the same order again for reasons like status update or invoice (maybe integration with other accounting software if required) ... etc
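A small sketch of the "give the order an ID, then reassemble later" idea, assuming one message per line item and a partition chosen by hashing the product ID (the partition count, field names, and functions are hypothetical, and no real Kafka client is involved):

```python
import uuid
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed partition count, purely for illustration

def partition_for(product_id):
    # Stable hash so the same product always lands on the same partition.
    return zlib.crc32(product_id.encode()) % NUM_PARTITIONS

def split_order(items, order_id=None):
    """Turn {productId: quantity} into one message per item, tagged with the order ID."""
    order_id = order_id or str(uuid.uuid4())
    return [
        {
            "orderId": order_id,
            "productId": pid,
            "quantity": qty,
            "partition": partition_for(pid),
        }
        for pid, qty in items.items()
    ]

def reassemble(messages):
    """Group per-item messages back into orders, in whatever order they arrive."""
    orders = defaultdict(dict)
    for m in messages:
        orders[m["orderId"]][m["productId"]] = m["quantity"]
    return dict(orders)

messages = split_order({"p1": 2, "p2": 1}, order_id="o-1")
print(reassemble(messages))  # {'o-1': {'p1': 2, 'p2': 1}}
```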
Hi Jordan, at 13:48 you described how atomic operations and locks in the database might slow down the application when a lot of customers are placing new orders concurrently, and said that one feasible option for improvement is a write-back cache.
Can you provide links to relevant online articles or discussions describing a similar write-back cache idea? (I watched the Ticketmaster design video from this channel and just wonder whether any other discussion describes a similar idea.) Thanks again.
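Not a link, but a minimal in-process sketch of the write-back idea, assuming a plain dict stands in for the database and a single cache node owns each product's stock (the class and method names are hypothetical). Orders are checked against the cache only, and the database is caught up later in `flush`.

```python
import threading

class WriteBackStockCache:
    """Serves stock checks from memory and writes to the 'database' lazily."""
    def __init__(self, db):
        self._db = db               # dict standing in for the real database
        self._stock = dict(db)      # in-memory copy that takes the hot writes
        self._dirty = set()         # products changed since the last flush
        self._lock = threading.Lock()

    def reserve(self, product_id, qty):
        # Only the in-memory copy is touched on the request path.
        with self._lock:
            available = self._stock.get(product_id, 0)
            if qty > available:
                return False
            self._stock[product_id] = available - qty
            self._dirty.add(product_id)
            return True

    def flush(self):
        # Called periodically; pushes accumulated changes to the database.
        with self._lock:
            for pid in self._dirty:
                self._db[pid] = self._stock[pid]
            flushed = len(self._dirty)
            self._dirty.clear()
            return flushed

db = {"p1": 3}
cache = WriteBackStockCache(db)
cache.reserve("p1", 2)   # True
cache.reserve("p1", 2)   # False, only 1 left in the cache
cache.flush()            # db["p1"] is now 1
```

The trade-off is durability: anything reserved since the last flush is lost if the cache node dies, so a real system would want replication or a log in front of it.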
If someone places an order, I think we should reduce the available quantity in the product DB, but how do we do that?
I was proposing we do it in microbatches - have a stream processing framework collect them and every 10 or so minutes update the count
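Roughly, the microbatch idea could look like the sketch below, assuming a dict stands in for the product DB and something else (a timer or the stream processor's window) calls `flush` every interval. The class name and methods are hypothetical, not a real stream processing API.

```python
from collections import Counter

class StockMicroBatcher:
    """Accumulates ordered quantities and applies them to stock in one batch."""
    def __init__(self, stock):
        self.stock = stock          # dict standing in for the product DB
        self._pending = Counter()   # productId -> quantity ordered since last flush

    def on_order(self, product_id, qty):
        # Cheap in-memory aggregation; no database write per order.
        self._pending[product_id] += qty

    def flush(self):
        # One write per product per window instead of one per order.
        for pid, qty in self._pending.items():
            self.stock[pid] = self.stock.get(pid, 0) - qty
        applied = dict(self._pending)
        self._pending.clear()
        return applied

stock = {"p1": 10, "p2": 5}
batcher = StockMicroBatcher(stock)
batcher.on_order("p1", 2)
batcher.on_order("p1", 1)
batcher.on_order("p2", 4)
batcher.flush()
print(stock)  # {'p1': 7, 'p2': 1}
```

Between flushes the stored count is stale, so this fits displaying stock better than deciding whether an order can be accepted.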
Can I mention DB locking to deal with concurrent editing on the shopping carts? Although lower throughput.
Definitely a solution, but let me remind you that it's still the case that you'd need something like a websocket to propagate the changes, unless a special database set structure was used. Agreed on the lower throughput part.
Hey jordan can you do a system design for something like shopify?
Can you try and elaborate on which part of Shopify you think is tough to design?
@@jordanhasnolife5163 The database and the multi-tenant aspect of it. I'm trying to improve my database design skills. Would you say NoSQL would work better for something like Shopify?
like your attitude to those asking to design AWS. 😄
Their wives are mine
Can you recommend other resources for system design?
Well, let me answer your question with a question: is there anything else you want me to cover? If not and you just want something more formal, Designing Data-Intensive Applications
@@jordanhasnolife5163 maybe videos on low level and high level designs both
New potential topic to cover: eBay (Auction System)?
Will look into this as well!
Debating whether it's fundamentally any different than something like a chat application - as far as I can think of for now it seems somewhat similar, with the exception of not allowing illegal bids, meaning that you'd probably have to use atomics on a database if you want synchronous responses, or if stream processing is okay you could use that as well.
I understand shit about system design, here only to enjoy jerk jokes
I'm not joking 😈
Keep the jokes. Don't be a chicken.
Alright but if I get fired you're gonna have to pimp me out
@@jordanhasnolife5163 Best fire is HR violation fire. In the worst case you will monetize your personal life dramas on YT.
@@grabarzowaty Haha I think you may be slightly overestimating how lucrative youtube is compared to software engineering, but perhaps one day I can live the sigma male dream lol
Slid into the interviewer's wife's DMs. She hit me with a LC hard
Is that why you hit her with a hard of your own?
I am the first viewer. So what are the to-make videos? Options:
- Payments, like a digital wallet or PayPal, as an example of using transactions.
- Real-time gaming leaderboard
...
Thanks for the suggestions, I'll think about those and if I feel the solutions are unique enough I'll probably go ahead and add them! Thanks Franklin!
@@jordanhasnolife5163 Yes. Leaderboard would be nice to explore. Then I'd like to know about an online chess game (or any online board game). I believe CRDTs will come in handy here.
@@2tce Will try to make a gaming video at some point, but do note that there is a leaderboard video already up (#19)