Two things make this video stand out for system architecture interviews:
1) general knowledge of the available options, with arguments for and against
2) enough in depth knowledge to go deep and impress
One of the best videos of its kind.
Small inaccuracy: HBase, being a wide-column store, actually stores column families together, not individual columns.
Appreciate it!
Dude, today I've passed an interview from a first try! Your videos are extremely helpful. I was just putting on a board all you talked about. I'd fail if I hadn't watched your videos. Thanks!
Congratulations!! Glad to hear the hard work paid off for you!
finished this series, im proud of myself. you are so funny btw
I gotta say, this summary video is great!
As much as you dread redundancy here, I at least got a ton of value out of it. The material is fantastic for reviews.
Kudos, and great stuff!
Glad you are back with system design videos😭😭
We'll see about that one buddy, these have been covered mostly
Thanks for this video man! While I agree with you that it'd be better to watch your more in depth videos, this compilation video works great for a quick recap right before going into your System Design interviews
Glad to hear!
Since both SQL and NoSQL DBs are ACID compliant, the key reason to ever choose SQL over noSQL is if you want to join multiple tables with data and if your queries are more aggregate based (ex: SUM, COUNT, AVG) and records are used in a combined manner rather than to store rows of data for unrelated records.
While I think you make some valid points here, I think by default everyone should want to use SQL. Not all NoSQL DBs are ACID compliant, especially in a distributed setting.
I agree that tables that store unrelated records generally play nicely with NoSQL, but that doesn't even necessarily warrant using it unless the specific database that you choose gives you some performance improvement that you couldn't have had otherwise.
I finished the whole series :) , wish me luck on my System Design interview
You got this!
I've gone through all your concepts and interview video and this video did a great job of summarizing everything!
Thanks for everything, giga chad! :P
All the best, y'all! Let's get this bread! 🚀
wow
2.5x speed. Interview in 19 hours. Let's go
Lfgo
Absolutely great work. Someday you should talk about the interview questions that you asked candidates and any interesting approaches they took and also about some interview questions that zapped you.
PS: Towards the start of this video you asked us to get lotion and paper. What gives?
1) I've never interviewed anybody, I'm a sham :)
2) you need the paper to take notes and the lotion to keep the pencil from sticking to your otherwise sweaty hands
You are sharing awesome content. Great to link to for short and accurate explanations.
Would be great to see more on Distributed SQL (you did Spanner but there's also YugabyteDB, CockroachDB, TiDB, YDB). And on PostgreSQL compatible databases (you did Aurora but there's also AlloyDB, Neon, YugabyteDB)
Nice idea! And thank you!
This is what I'm looking for! Great quality - thank you very much!
watched your video about why you left Google and you mentioned you're a new grad.. extremely impressive you know all of this already! any good books/resources you used? thanks for the videos!
Thanks! I'd just recommend really reading and understanding designing data intensive application, and that should give you all the background that you need to go do more of your own research!
B-tree writes can go to memory too. It's called buffer pools.
Good point
huh, i subbed for day in the life vids 😒
I'll sell out soon I promise
Great Video, One question, where can we learn about db schema design? Some basics and exercises would be good, any online course you recommend?
I'd just look at database docs and existing engineering blogs from reputable companies!
Thanks for the nice series. I really liked your videos
this is an awesome video! thanks for such a great summary
Great work and amazing video! Could you also make more low level design videos?
Thank you senpai 🙏🏽
16:30, I haven't heard of column compression being used for image data in the way that you describe here, any pointers on what you were talking about when you mentioned this?
Hey so I don't actually mean to compress the images with column compression:
I just mean having a column containing multiple images means that you only have to fetch the images themselves as opposed to potentially a lot of metadata that may come with them (if you were to fetch a row at a time)
@@jordanhasnolife5163 I paused the video at this point in confusion as well, because I'm afraid the example doesn't make much sense. In the query you described, you only want to get the thumbnails associated with a specific video, so you would either implement that with a relational table (full_video_id | thumbnail_id, where one full_video id is associated with one or more thumbnail_ids) or you'd store a list of the thumbnail_ids (pointing to the actual image data in, say, s3) on a document representing the full video. The only situation in which you would possibly want to store images in a column is if you'd want to somehow query ALL thumbnails across ALL videos, but that is not the situation you described - you described getting the thumbnails of a SINGLE video. That would be OLTP/row-based, not OLAP/column-based. Also, columns typically contain primitives (so you could, for example, perform an average across a column of floats)
@@BenLernerOfficial Yes sorry, this is assuming that one video might have many thumbnails (e.g. to create one of those gifs that you see on YouTube now). Sorry this wasn't clear, everything that you've said is accurate.
Another common use case is to load all thumbnails for a user's channel, such as if you were to click my channel page.
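To make the row-versus-column trade-off in this thread concrete, here is a toy sketch in Python. Every field name and value is hypothetical, and plain lists stand in for on-disk storage; it only illustrates that a column layout lets you read one attribute without touching the rest of each record, not how any real columnar database stores images.

```python
# Toy illustration only: in-memory lists stand in for on-disk layouts.
# All field names and values here are hypothetical.

def to_columns(rows):
    """Pivot a list of row dicts into a dict of per-field lists."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns

# Row-oriented: each record keeps all of its fields together.
rows = [
    {"video_id": 1, "title": "intro", "uploader": "alice", "thumbnail": b"img-1"},
    {"video_id": 2, "title": "part 2", "uploader": "alice", "thumbnail": b"img-2"},
    {"video_id": 3, "title": "recap", "uploader": "alice", "thumbnail": b"img-3"},
]

# Column-oriented: each field is stored contiguously.
columns = to_columns(rows)

# Reading every thumbnail from the row layout walks each full record...
thumbs_from_rows = [row["thumbnail"] for row in rows]

# ...while the column layout hands back just the one field.
thumbs_from_columns = columns["thumbnail"]
```

Both reads return the same three thumbnails; the difference is how much unrelated metadata you have to pass over to get them.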
Hey Jordan, just started watching every video you've created. I love them. I'm wondering how I could get in contact with you as soon as possible. Id like a couple minutes of your time if possible. Thanks x
LinkedIn would probably be best, my name is Jordan Epstein
@@jordanhasnolife5163 thank you, sent a msg ^_^
Thank you, Jordan!
Thank you buddy!
How do you gain so much knowledge in system design? Really amazing!
I have no life!
No but actually, I just have optimized my knowledge specifically for the interview haha - I'm sure you all are better software engineers than me
@jordanhasnolife5163 lol no. I'm trying to learn from you and get better :)
Really good one! Thank you Jordan! =)
bro I watched your earlier videos in 1.25x speed and now your normal voice feels weird and slow. Nevertheless great and orderly content. Cheers! Would recommend others too :)
Damn bro 1.25? Gotta speed that up to 2
Thank you for the great content
why redis instead of just using the hashmap in your program? for cross process communication?
Well sometimes you want many servers, sometimes you want replication, sometimes you want a writeahead log, sometimes you want database partitioning
Thanks for the video man! it was informative
could you please create a video if possible on scenario-based database usage I am really confused about where to properly use sql db and nosql db
I am little clear that if we need ACID properties then best is sql.
but I am not completely aware of different other scenarios on where to perfectly use sql and nosql dbs. if you also have any resources please share I am not able to find a good one
I think you basically just expressed it yourself - "if you need acid properties use sql" - if data integrity is the most important part of your application, SQL is the way to go. Otherwise, NoSQL can offer greater speed while sacrificing some of these requirements.
@@jordanhasnolife5163 Thanks Jordan
I am thinking of a scenario in case of storing product related things I see nosql is best suited as different product could have different properties, but how about managing the inventory for the product?
in this case since it requires acid props to manage the inventory count properly, should we maintain the inventory count details alone in sql DB?
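One way to picture the inventory half of this question is a single conditional UPDATE inside a transaction. The sketch below uses Python's built-in sqlite3 module; the `inventory` table, its columns, and the `reserve_stock` helper are all made up for illustration, and a real system would still need to decide how the SQL count relates to the product documents.

```python
import sqlite3

def make_demo_db():
    """Create an in-memory database with a hypothetical inventory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory ("
        "product_id TEXT PRIMARY KEY, "
        "stock INTEGER NOT NULL CHECK (stock >= 0))"
    )
    conn.execute("INSERT INTO inventory VALUES ('widget', 3)")
    conn.commit()
    return conn

def reserve_stock(conn, product_id, quantity):
    """Atomically decrement stock; return True only if enough was available."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE inventory SET stock = stock - ? "
            "WHERE product_id = ? AND stock >= ?",
            (quantity, product_id, quantity),
        )
        # rowcount is 1 when the row matched the stock check, 0 otherwise
        return cur.rowcount == 1

def current_stock(conn, product_id):
    """Read the current count for one product."""
    row = conn.execute(
        "SELECT stock FROM inventory WHERE product_id = ?", (product_id,)
    ).fetchone()
    return row[0]
```

Because the check and the decrement happen in one statement, two buyers cannot both take the last unit; the second reservation simply matches zero rows.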
which database is of choice when you need SQL database but the dataset is too large and you need to shard the data or the database needs to be distributed?
A SQL database lol. You can still shard your data here, just be smart about how you do it.
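As a rough sketch of what "shard your data" can look like at the application layer: hash a shard key and route each row to one of N SQL instances. The function name, the `user-…` keys, and the shard count below are all assumptions for illustration. Plain modulo hashing also reshuffles most keys whenever the shard count changes, which is one reason the choice of scheme deserves some thought.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a shard key to a shard index using a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # hashlib is used instead of built-in hash(), which is salted per process
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def group_by_shard(keys, num_shards):
    """Bucket keys by the shard that would own them."""
    buckets = {shard: [] for shard in range(num_shards)}
    for key in keys:
        buckets[shard_for(key, num_shards)].append(key)
    return buckets

# Hypothetical usage: route user rows across four database instances.
user_ids = [f"user-{n}" for n in range(20)]
placement = group_by_shard(user_ids, 4)
```

Picking a shard key that most queries filter on (here, the user) keeps the common reads and writes on a single shard.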
Could you please make a video on Wide column vs column family vs columnar vs column oriented DBs with some examples
Hey! I think I probably mentioned this more in the 1.0 series but not sure that it deserves a full video, just look up images of the formats :)
@@jordanhasnolife5163 , please give me link of that video
What if you need a NoSQL store with strong consistency? You need Hbase or MongoDB. And if you need a db optimized for heavy reads, you may need MongoDb since it uses B tree.
Mongo might be better for reading, sure, but I'd caution you against saying it and HBase are strongly consistent. Hadoop has some weird writing thing that kinda makes it strongly consistent, and maybe you can configure Mongo to do so, but Hadoop writes aren't actually achieving consensus (and afaik Mongo's aren't either), so it's kinda just not great for that haha
@@jordanhasnolife5163 what is that weird writing thing?
@@franklinyao7597 You like write to multiple nodes at once and only get a success message if it's hit a certain amount of them, but the write still goes through on some of the nodes even if you don't meet the success threshold if I remember correctly
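Here is a toy model of the behaviour described in that reply: write to every replica, report success only if enough of them acknowledge, and do not undo the writes that did land. The `Replica` class and `threshold_write` function are invented for illustration and are not the HBase, HDFS, or MongoDB write path.

```python
class Replica:
    """A pretend storage node that can be marked unhealthy."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.data = {}

    def write(self, key, value):
        """Store the value and acknowledge, unless the node is down."""
        if not self.healthy:
            return False
        self.data[key] = value
        return True

def threshold_write(replicas, key, value, required_acks):
    """Send the write to all replicas; succeed only if enough acknowledge.

    Note that there is no rollback: replicas that accepted the write keep it
    even when the overall call reports failure.
    """
    acks = sum(1 for replica in replicas if replica.write(key, value))
    return acks >= required_acks

# One healthy node out of three, with two acknowledgements required.
nodes = [Replica("a"), Replica("b", healthy=False), Replica("c", healthy=False)]
succeeded = threshold_write(nodes, "k", "v", required_acks=2)
```

In this run the caller is told the write failed, yet node "a" still holds the value, which is the gap between an ack threshold and real consensus.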
B-Trees are not binary trees. The video itself is still quite good.
Oops typo
Whenever I mention Cassandra in a Systems Design interview, the interviewer always seems to have some horror story concerning it (often its performance!)
Interesting, I'd be curious if you pushed back on them a bit to ask them what the workload was and why the performance was poor what they'd say!
@@jordanhasnolife5163 It’s often that “We tried it and it was slower”. I’m guessing they were approaching it as some sort of vertical solution (so a faster RDBMS) rather than a horizontal solution, and retaining an application model that was optimized for single-leader.
What about distributed SQL databases like Spanner/CockroachDB?
I think these are probably worth knowing about from a software engineering perspective but probably not worth using in a design for an interview. Spanner (can't speak for cockroach) is great, but I think it may be too niche to be fair game here (since it doesn't exactly have a "dedicated" use case).
Why no honorable mention of Dynamo & BigTable ?😀
Mainly because BigTable = HBase and Dynamo = Cassandra (it actually may not be, assuming you're talking about DynamoDB, but there are no docs on the internal implementation afaik)
Are your slides available to view/download somewhere?
In my channel description
hahahah i just like how he call us , you lazy f**s and do it
Nice
Are trees with more than two children for a given parent still considered binary trees?
Nope
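For anyone wondering what "more than two children" looks like, here is a minimal sketch of a B-tree style lookup in Python. The `BTreeNode` class and the hand-built tree are just for illustration; there is no insertion or rebalancing, only the search over nodes that hold several sorted keys.

```python
from bisect import bisect_left

class BTreeNode:
    """A node with k sorted keys and, if internal, k + 1 children."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def search(node, key):
    """Return True if key is present in the subtree rooted at node."""
    # Find the first position whose key is >= the target.
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return True
    if not node.children:  # leaf node: nowhere left to look
        return False
    # Child i covers the range between keys[i - 1] and keys[i].
    return search(node.children[i], key)

# A root with two keys and therefore three children, so not a binary tree.
root = BTreeNode(
    [10, 20],
    [BTreeNode([1, 5]), BTreeNode([12, 15]), BTreeNode([25, 30])],
)
```

Packing many keys into each node is what keeps the tree shallow, so a lookup touches only a few nodes.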
Salute😊
Finalyyyyyyyyyyy
Scylla DB ??
I'd consider it a Cassandra clone
Kudos!
Did the Goat just say he’s insecure ?
You think a secure person would spend multiple years of their life lifting weights and studying systems design?? 😭
Yay for Women!
Just defended women against a misogynist on Xbox live the other day
@@jordanhasnolife5163 Yay Jordan! 🤗 lol
No S3 🥲
Not a database - though technically some cloud native data warehouses are being built using s3 as the storage layer and parquet files
Talking too fast.
Can't wait for you to discover .75 speed
Playing at 1.5x. His speed is just fine
Talking too sparse
This guy stores! 🫣