If you are preparing for a system design interview, try get.interviewready.io.
All the best!
"You can think of your database as a data structure" Man... I wish I've had someone explain things so comprehensive when I was in school
I'm reading 'Designing data intensive applications' and needed a reminder of what exactly LSM trees were, this video is amazingly helpful! Everything I've learned so far is clearer!
Same here. I wish there was a channel that summarizes DDIA concepts chapter-by-chapter, so we can watch the videos after reading the text to further solidify our understanding.
I was going through the "Designing Data-Intensive Applications" book and it all read like plain text. I came here, watched the video, and now the same text forms images in my mind. Thanks man, you have created a perennial resource for system design topics.
same here :)
Same here 😅
*Explaining Bloom filter* - example word "cat", let's think of another word with c... "corona" 🤣 I had to check the date of the video and indeed it's post the COVID-19 outbreak. Keep up the good work! Amazing content!
even i checked the date lol :P
Good one. This video summarizes most of the 3rd chapter in Designing Data Intensive Applications.
Thanks :)
which book?
You look very happy in this video, Gaurav! Nice to see this as I'm starting interviews again -- you helped me out last round, glad to have you onboard this time too :)
All the best!
I just finished the Storage and Retrieval chapter of Designing Data-Intensive Applications and this video came out. It adds more insight and flavor to our understanding.
At the end of the video I immediately related it to the internal architecture of Cassandra/Elasticsearch (also learnt from one of your videos 😃). We write data sequentially and periodically dump it into sorted string tables. Then, to reduce the number of operations and get better read times, we occasionally merge those sorted string tables. So the combination of a linked list and sorted arrays helps us scale our database by providing better write/read performance.
Great Content. Thanks a lot Gaurav!
Thank you!
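A minimal Python sketch of the write path described in the comment above (buffer writes in a memtable, flush to immutable sorted runs, search newest-first on reads). The class name and flush threshold are invented for illustration; this is not Cassandra's actual code.

```python
import bisect

MEMTABLE_LIMIT = 4  # hypothetical flush threshold, tiny so the example is visible

class TinyLSM:
    def __init__(self):
        self.memtable = {}   # in-memory writes, cheap to update
        self.sstables = []   # each flush appends an immutable sorted list of (key, value)

    def put(self, key, value):
        self.memtable[key] = value                 # fast write path
        if len(self.memtable) >= MEMTABLE_LIMIT:
            self.flush()

    def flush(self):
        # dump the memtable as a sorted string table (SSTable)
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:                   # newest data first
            return self.memtable[key]
        for table in reversed(self.sstables):      # then newest SSTable first
            i = bisect.bisect_left(table, (key,))
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None
```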
"Think of a Database as a DataStructure" is like the keyword for us to imagine and understand things better...thanks a lot dude !!
Man this guy really knows how to teach. Keep it up Gaurav!
I think the explanation of compaction is missing a major point here.
Compaction means throwing away duplicate keys in the log, since there can be multiple updates for a single key. With compaction we keep only the most recent one.
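A sketch of compaction in that sense, assuming the runs are ordered oldest to newest so later values win (real engines also handle deletes via tombstones, which this skips):

```python
def compact(sstables):
    """Merge sorted (key, value) runs, keeping only the newest value per key."""
    latest = {}
    for table in sstables:          # later (newer) tables overwrite earlier ones
        for key, value in table:
            latest[key] = value
    return sorted(latest.items())   # one compacted, sorted run

old = [("cat", 1), ("dog", 1)]
new = [("cat", 2), ("emu", 3)]
print(compact([old, new]))          # [('cat', 2), ('dog', 1), ('emu', 3)]
```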
I am so happy to watch this playlist.
Most of the time I read on platforms like Quora or Reddit that developers never use DSA knowledge in their work, and that it's all just to pass the interview test.
But this series proves that wrong.
If someone is working on a database, or on services that will be used as a black box by others, then knowledge of DSA is very useful.
Please correct me if I am wrong, and add some more examples if I am right.
Great channel, thanks a lot. I personally would love to see some real-world examples of this (open-source code etc.), maybe as posts/short vids. I know it's very time-consuming stuff, but I bet it's very interesting to see and understand how real-world frameworks/projects use fundamental knowledge under the hood.
You can go through Cassandra's docs: docs.datastax.com/en/archived/cassandra/3.0/cassandra/dml/dmlHowDataWritten.html
Wow... never thought about database optimization from the write and read perspective; I always thought about query optimization.
Thanks a lot for providing such useful information. Thanks again, and keep doing this great work.
Brother, you are a guru. Best explanation, with real-world examples.
Zooming into your eye was goooood, and so was the "once more". I think yours and Clement's videos are like fun with learning. No jokes on card tricks though.
Finally someone explained LSM tree clearly
OMG this guy gives a much much much better explanation than my professor.
If you're here because you heard LSTM at the end and were like "Wait, what ?", He meant LSMT ( Log Structured Merge Trees ) :)
Yes, I messed up with neural networks :P
No time wasted; it's really good for those who want to know all this stuff. Thank you brother, it's so helpful for me to expand my knowledge.
We can also divide the searches into ranges of chunks. The first chunk ranges from, say, 1 to 30, the second chunk from 31 to 60, and so on. If we want to know whether the number 45 is present in the DB, we do a binary search on the second chunk only.
Sharding ?
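That range-per-chunk idea from the comment above can be sketched like this; the boundaries and data are made up:

```python
import bisect

bounds = [1, 31, 61, 91]            # chunk i covers keys in [bounds[i], bounds[i+1])
chunks = [
    [3, 7, 12, 25],                 # keys 1..30
    [33, 45, 58],                   # keys 31..60
    [61, 77, 89],                   # keys 61..90
]

def contains(key):
    i = bisect.bisect_right(bounds, key) - 1   # which range does the key fall into?
    if i < 0 or i >= len(chunks):
        return False
    chunk = chunks[i]
    j = bisect.bisect_left(chunk, key)         # binary search inside that one chunk
    return j < len(chunk) and chunk[j] == key

print(contains(45))   # True: only the second chunk is searched
```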
Hi Gaurav, thanks for the video. With a B+ tree we have read and write complexity of O(log n), whereas you claim that with this approach writes can be O(1), or at least less than O(log n). My concern is that this will vary a lot depending on how many sorted chunks you have, how many merges you do, the size of the sorted chunks, etc. On top of that there is a lot of processing and complexity in implementing/maintaining/monitoring these sorted chunks, the bloom filters, and the corner cases. There are cases where this approach beats B+ trees, but there are also cases where B+ trees stand out. Take Facebook as an example, with billions of records to process. How many chunks would you make, and how many would you merge and apply bloom filters to? You cannot go with 6 chunks as in your example; you might need 60k to 6M chunks, and then a first-level search across those chunks, a second-level search within a chunk, the bloom filter, and so on. And how would you handle record deletion, and how many things (chunks, bloom filters, maybe more) would you need to change?
I was just looking for this kind of response.
B+ trees tend to offer predictable performance across read-heavy workloads, and their balancing properties ensure O(log n) read and write times even at scale.
While bloom filters and chunk-based approaches shine in write-heavy workloads (like log-structured merges in LSM-trees), they may struggle when reads need to scale efficiently across massive datasets, especially with frequent deletions and updates.
There are very few sights more beautiful than the thumbnail of Gaurav Sen in front of a white board
Would like a video on internals of in-memory databases that also offer persistence.
Hi Gaurav - this video makes a lot of sense after solving "merge k sorted lists".
I have a good understanding of how MySQL does all of this, and the only real advantage to this queuing technique is multi-row inserts. Inserting multiple rows at once is faster than doing the same number of individual row inserts. When you insert data into the (InnoDB) storage engine, it stores the data records directly in the primary key BTree, so the sorting is done upon insert. If you want to optimize bulk inserts, simply create a queue of records and periodically dump that queue to the database. The types of workloads that benefit from queued inserts would primarily be ETL data transfers.
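A hedged sketch of that queue-and-dump idea. `conn` is any DB-API 2.0 connection; the table, schema, batch size, and `?` placeholder style are hypothetical and depend on the driver actually used.

```python
class BufferedWriter:
    """Queue rows in memory and flush them as one multi-row insert."""

    def __init__(self, conn, batch_size=1000):
        self.conn = conn            # any DB-API 2.0 connection (placeholder style varies)
        self.batch_size = batch_size
        self.buffer = []

    def insert(self, row):
        self.buffer.append(row)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        cur = self.conn.cursor()
        cur.executemany(
            "INSERT INTO events (id, payload) VALUES (?, ?)",  # hypothetical table
            self.buffer,
        )
        self.conn.commit()          # one commit for the whole batch
        self.buffer = []
```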
You are the best. I’ve learned so much from your videos! Thanks so much for sharing!
Thank you Vinnie!
Great video. You are incredible in presenting stuff. Keep up the good work!
You did not consider duplicates caused by updating records. I found your video helpful while reading the Designing Data-Intensive Applications book - I'm exactly at this topic. Thanks for explaining it briefly yet covering the most important parts.
Thanks! I have talked a little more about this in the "Introduction to NoSQL" video. It's mentioned with the SST tables, I think.
I was asked about bloom filters recently in an interview!
Binge watched 16 out of 23 videos of this series!
2:17 It'll actually be slower in C/C++ (etc.) if you do a dynamic allocation for every element at runtime. You'll also lose out massively on caching performance. You should use a linked list of fixed-size arrays (chunks): O(1) insertion plus cache optimization.
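A rough Python sketch of that "linked list of fixed-size chunks" (an unrolled linked list). The cache benefit the comment describes really shows up in C/C++ with contiguous arrays; this only shows the structure.

```python
CHUNK_SIZE = 64  # arbitrary; in C/C++ you'd size this to cache lines or pages

class Chunk:
    def __init__(self):
        self.items = []      # at most CHUNK_SIZE elements, stored together
        self.next = None

class UnrolledList:
    def __init__(self):
        self.head = self.tail = Chunk()

    def append(self, value):
        # O(1): a new chunk is allocated only once every CHUNK_SIZE appends
        if len(self.tail.items) == CHUNK_SIZE:
            new_chunk = Chunk()
            self.tail.next = new_chunk
            self.tail = new_chunk
        self.tail.items.append(value)

    def __iter__(self):
        node = self.head
        while node:
            yield from node.items
            node = node.next
```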
We need more videos this time. If you keep uploading every day, that might help.
It would be good if you can make a video on "which storage engine to use and when", technically telling us when to use traditional sql vs nosql. I know there is no straightforward answer to it but just to know your view.
Correct me if I'm wrong, but this is for analytics DBs. For transactional DBs, which need to change/delete values, a sorted array would slow down writes; transactional DBs still need to use B+ trees (at least the SQL variants). A write-heavy database doesn't automatically mean an analytics DB. Imagine a scenario where you're storing a background job's state in the DB and the number of jobs scales up to 1 billion for 1 billion users every 2 hours: you have a billion records being ingested, but at the same time they need to be updatable.
@Gaurav amazing video, I wish I could hit like a million times. Keep up the amazing content.
watching your videos my brain explodes - in a good way. Thanks for your videos!
The problem with Gaurav Sen's videos is that if one is 17:22 long, you need to watch it for exactly 17:22 to understand the concept. The moment you skip forward 5 seconds anywhere, you need to rewind 20 seconds.
WOW this is a great video, thanks man!!!! You were a great help.
Cheers!
I actually want to write my own database engine. This can help.
Thank you for the video and for explaining it in such a simple way!
Really nice video, few questions
1. What happens if the server goes down before we get a chance to flush the data from memory to the DB?
2. If we already know the ideal size of the segment to read the data efficiently, why not start with having the memory of the same size instead of opting in to merge the sorted segments of a smaller size?
Databases keep those entries in logs called redo logs. In this situation, you apply the redo logs to recover that data.
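A minimal sketch of that write-ahead/redo-log idea: append to the log before touching the memtable, and replay the log on restart. The file format here is invented for illustration.

```python
import json
import os

LOG_PATH = "wal.log"  # hypothetical log file

def write(memtable, key, value, log_path=LOG_PATH):
    # append to the log first, then apply the write in memory
    with open(log_path, "a") as log:
        log.write(json.dumps({"k": key, "v": value}) + "\n")
        log.flush()
        os.fsync(log.fileno())   # make sure the entry survives a crash
    memtable[key] = value

def recover(log_path=LOG_PATH):
    # on restart, rebuild the memtable by replaying the log in order
    memtable = {}
    if os.path.exists(log_path):
        with open(log_path) as log:
            for line in log:
                entry = json.loads(line)
                memtable[entry["k"]] = entry["v"]
    return memtable
```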
Greetings from Peru, keep up the good work Gaurav Sen
Thank you 😁
8:03 It's still O(n log n) imo, because in the worst case you'll go over every single chunk (n of them) and binary search each one (log n), since you can't really separate them into sequentially sorted chunks (e.g. a first chunk holding sorted values from 1 to 50, a second chunk from 51 to 100, ...). That would require sorting every chunk in O(n log n) time, because we're getting random values in every query block.
"Revisiting is always great". I revisit this video after encountering real-world implementation from "RocksDB Internals".
An example you could maybe give here for a bloom-filter-style check: consider the 0th and nth elements of the sorted array. If I need to search for a number and it lies within the range of the smallest and largest elements, there is a chance it could be there (it may not be), but if the number is smaller than a[0] (the smallest number in the sorted array) or greater than a[n] (the largest), there is no way it could be there. Hope that helps.
That's an interesting way to think of it!
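Strictly speaking that min/max test is key-range pruning rather than a bloom filter (a bloom filter can also catch gaps inside the range), but the effect is similar: many runs are skipped without a binary search. A small sketch:

```python
import bisect

def might_contain(run, key):
    # cheap pre-check against the smallest and largest keys only
    return bool(run) and run[0] <= key <= run[-1]

def lookup(runs, key):
    for run in runs:
        if not might_contain(run, key):
            continue                      # skip this run entirely
        i = bisect.bisect_left(run, key)
        if i < len(run) and run[i] == key:
            return True
    return False

runs = [[1, 4, 9], [15, 22, 40], [41, 77, 90]]
print(lookup(runs, 22))    # True: only the second run is binary searched
print(lookup(runs, 100))   # False: every run is rejected by the range check
```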
Thank you Gaurav! You are an amazing teacher! I super super like ur videos!
Thank you!
Great video class! Congrats buddy! ✌️
I thought you got this merge idea from your TIM SORT video. Nice explanation :)
Thank you Gaurav for ur amazing work!
Awesome as always! Keep uploading :)
Great content! Thanks a lot, Gaurav. Keep it up!
Thank you 😁
Nice editing with ONCE MORE
Hi Gaurav,
Pardon me if I understood something wrong but:
1. Around 7:45, if we have 10,000 sorted records (n) in the SST and we are trying to insert 6 new records (m), why do we need to sort all (10,000 + 6) records? Instead, can't we do a binary search on the 10,000 records to find the insertion position (log n) for each of the 6 records?
Then the time complexity would be O(m * log n) = 6 * log(10,000), which is tiny.
2. Also, if we go with the approach of merging sorted arrays (two arrays of 6 elements each merging into one array of 12 elements), why do we need to traverse all the arrays to find a record x? Since the arrays are already sorted, we can easily compare the starting and ending elements of each array with x. If the starting element is more than x, or the ending element is less than x, we don't need to traverse that array and can check another array where a[0] <= x <= a[n].
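On point 1: a sketch of finding the insertion spots by binary search. The comparisons are indeed O(m log n), but inserting into the middle of an array still shifts elements (O(n) each), which is one reason real engines rewrite whole runs during a merge instead.

```python
import bisect

def insert_batch(sstable, batch):
    # O(log n) to locate each slot, but O(n) data movement per insert
    for key in sorted(batch):
        bisect.insort(sstable, key)
    return sstable

print(insert_batch([2, 5, 8, 13, 21], [3, 34]))   # [2, 3, 5, 8, 13, 21, 34]
```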
This is very good content. Very good editing. Thanks 👍
Thanks Gaurav for the wonderful video... I've been your fan since the first video itself.
Thank you! Very useful content.
Thanks, Gaurav. The idea is quite similar to Apache HBase.
I believe compaction can use an optimal merge strategy similar to Huffman coding: take the smaller lists first and then merge the remaining ones, which minimizes the total merge cost.
Is it? I thought the leveled compaction strategy was the most popular one.
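A sketch of that smallest-first policy (the classic optimal merge pattern). Whether a given engine actually does this, versus size-tiered or leveled compaction, is a separate question.

```python
import heapq

def total_merge_cost(run_sizes):
    """Always merge the two smallest runs; each merge costs roughly a + b copies."""
    heap = list(run_sizes)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        cost += a + b
        heapq.heappush(heap, a + b)   # the merged run goes back into the pool
    return cost

print(total_merge_cost([10, 20, 30, 40]))   # merges cost 30, 60, 100 -> 190 total
```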
Great video!!🤯
MORE MORE MORE MORE MORE!!! WE ALL LOVE U!!!!
I think that's how Cassandra makes writes faster: a memtable in memory that is flushed to SSTs for good write latency, and bloom filters plus row/key caches to reduce read latency!
And that's where the leet code question of merging sorted arrays is going to help you in production !
You are an inspiration.
Thanks for the video!
You're welcome!
Hey gaurav! Amazing video! 😁
Awesome. Great video. Very nice explanation. Keep uploading. :)
very well taught ! thanks !
Excellent videos! What's your approach for becoming so knowledgeable about such a broad range of topics?
Isn't this how Cassandra works? Great concept.
Yes :D
Nice informative video..!!
Nice editing as well...))
Thank you for the explanation.
I guess this is basically how Cassandra works as it uses SSTs?
Yes :)
Appending at the end of the linked list, the time complexity is O(N) (we need to traverse the linked list before we can append). Correct me if I'm wrong.
You keep track of the tail and update it whenever there is an insert.
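A tiny sketch of that tail pointer making appends O(1); the class names are made up.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class AppendOnlyLog:
    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, value):
        node = Node(value)
        if self.tail is None:        # first entry
            self.head = self.tail = node
        else:
            self.tail.next = node    # no traversal needed: O(1)
            self.tail = node
```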
thanks, nice explanation
Thank you!
Hey can you make videos on size tiered and leveled compaction? There aren't great video resources on that. Will be very helpful.
Awesome bro...keep going
Hi Gaurav,
Thanks for the video. I have a doubt regarding constant time for write operation.
1) Whenever a write request is triggered, it takes O(log m) (where m is the size of the memtable) to insert into the memtable, right?
2) A write query comprises two operations, right?
i) memtable
ii) log file
Great video! What would be the practical use of this concept?
What is the difference between an SST and an LSM tree? Both are doing the same thing after all.
Gaurav, can you explain this by taking a real scenario and writing pseudocode?
Why? I feel it's the same as a B+ tree overall, because if we look at a B+ tree abstractly, it's also maintaining sorted chunks of an array. But yes, in a B+ tree we don't have control over the chunk size or over when two sorted chunks get merged.
In a B+ tree, insertion is always O(log n),
while here it is constant most of the time (I guess).
The read operation is a bit heavy here, but I guess the bloom filter can help reduce that time.
Anyway, nice video.
I like the theory and the way you explain it. But how can this be applied? Are the log and the sorted array something supported by an RDBMS?
B+ tree implementations are used in most popular RDBMS.
Are the previous bloom filters of any help in the generation of the new bloom filter? As you said there is a high chance of having a lot of false positives in the new array if we don't increase the bits used by the new bloom filter. So, I'm assuming we need to again recalculate the new bloom filter discarding the previous two bloom filters with lower accuracies.
Yes, but the bloom filter can be made during the merge. That's O(N).
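A minimal bloom filter sketch to make that concrete. The hash scheme (Python's built-in hash with integer salts) and the sizes are purely illustrative, not what any real database uses.

```python
class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, key):
        # derive k bit positions by salting the hash; real filters use proper hash families
        return [hash((i, key)) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # False means definitely absent; True means "possibly present" (false positives allowed)
        return all(self.bits[pos] for pos in self._positions(key))

# built while merging two runs, as the reply above describes
bf = BloomFilter()
for key in ["cat", "car", "corona"]:
    bf.add(key)
print(bf.might_contain("cat"))   # True
print(bf.might_contain("dog"))   # almost certainly False
```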
Please do a NoSQL and SQL system design for ECommerce
But searching a bloom filter means we need to iterate over the query string to compute its hash value. As the size of the string increases, that operation grows, and it can end up costing more than our binary search.
Can you make a video using a practical DB?
Would it be a better approach to implement the bloom filter using a trie data structure?
2:29 I don't get it, why do you care about read speed on the linked list? Doesn't the database have to read it in order? Or is it used as some kind of cache?
Quite insightful. Is it just me, or is there some voice enactment of a TV news anchor in there? 🤣
Which one? 😛
At 5:11 felt like arnab
This guy is very young and knows a lot.
Which is better, a trie or a bloom filter?
Hi, I know this is weird to ask, but can you make a
"How to prepare for competitive programming for absolute beginners" video again, like an updated version, and explain more?
Go through Errichto's channel on YouTube. He's a top competitive programmer and can provide you useful techniques.
(Not taking anything away from this channel)
@@ishaangupta4125 alright thanks
@@ishaangupta4125 You gave very good information; just to add on top: competitive coding and interview preparation are quite different, so even though the topics are the same, the strategy of preparation is different.
When notification arrives from gaurav..
Me: Hey, you don't know anything about this concept. Don't waste your time by watching this.
My brain: you stop, Gaurav makes things clear..
When is the sorted chunk created? Is it during insertion or sometime later? And what happens when a query comes in for data which is not yet in a sorted chunk?
Could you make a user authentication, authorisation, roles and privileges system design video? Can't find any good tutorial on this!
I am on it!
Nice video 😃😃
Good one Gaurav!! Just a thought, isn't this closely related to LSM trees?
It is an LSM tree :)
A large DB will typically have millions of entries, so won't the merging be very inefficient in that case, since merging two sorted arrays has a time complexity of O(n1 + n2)? Even if this compaction happens asynchronously, it can still slow down the DB because it will happen frequently.
Merging millions of entries is very reasonable. It's "trillions of entries" which should concern you 😛
@@gkcs The number of entries will depend on the data size of each unit, but let's say the data is 100 GB. Sorting that much data frequently will not be efficient.