It's a great video, man. One thing that should be added: we don't just look up one geohash, we also calculate its 8 neighbours and look those up too.
@arpit35007 What do you mean by that? How do we calculate the 8 neighbours, and what exactly are they? Are these the 8 nearest quadrants that have multiple businesses in them?
@tanaygupta632 You'd want to calculate neighbors to handle the boundary problem. Imagine a location sits at the edge of a quadrant. You'd want locations from the next quadrant as well, since those could be closer to the user than anything in the same quadrant. So you'd fetch all locations in the adjacent quadrants and then sort by distance to the query point. There are libraries you can use to get the neighboring hashes for any given hash.
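For anyone wondering what those libraries do under the hood, here is a minimal pure-Python sketch, not the video's code. It decodes a geohash into its bounding box, steps one cell width or height in each of the 8 directions, and re-encodes the shifted centre at the same precision. The function names `encode`, `bounds` and `neighbors` are my own; real libraries usually use lookup tables instead.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def encode(lat, lon, precision):
    """Encode a latitude/longitude pair as a geohash of the given length."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, value, use_lon = [], 0, 0, True  # bits alternate lon, lat, lon, ...
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid
        else:
            value <<= 1
            rng[1] = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def bounds(geohash):
    """Return (lat_min, lat_max, lon_min, lon_max) of the geohash cell."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    use_lon = True
    for ch in geohash:
        idx = BASE32.index(ch)
        for shift in range(4, -1, -1):  # most significant bit first
            rng = lon_range if use_lon else lat_range
            mid = (rng[0] + rng[1]) / 2
            if (idx >> shift) & 1:
                rng[0] = mid
            else:
                rng[1] = mid
            use_lon = not use_lon
    return lat_range[0], lat_range[1], lon_range[0], lon_range[1]


def neighbors(geohash):
    """Return the (up to) 8 cells surrounding the given cell."""
    lat_min, lat_max, lon_min, lon_max = bounds(geohash)
    height, width = lat_max - lat_min, lon_max - lon_min
    center_lat, center_lon = (lat_min + lat_max) / 2, (lon_min + lon_max) / 2
    result = []
    for d_lat in (1, 0, -1):
        for d_lon in (-1, 0, 1):
            if d_lat == 0 and d_lon == 0:
                continue  # skip the cell itself
            lat = center_lat + d_lat * height
            if not -90.0 < lat < 90.0:
                continue  # no cell beyond the poles
            # wrap longitude across the antimeridian
            lon = (center_lon + d_lon * width + 180.0) % 360.0 - 180.0
            result.append(encode(lat, lon, len(geohash)))
    return result
```

A lookup would then query the user's own cell plus everything in `neighbors(cell)`, and sort the combined results by distance.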
Bloody hell, the idea about geohash4, geohash5, geohash6 is so simple, but at the same time is so genius (by the time of writing the message I've already watched a few videos about geohashing and proximity servers). I had plans with my girlfriend to do some Netflix & Chill, but I guess I would rather continue watching your other videos on Systems Design. Superb stuff!
Haha! You clearly got your priorities sorted out. Glad they are helpful to you.
Wow, you have even shared your notes!! That's perfect!
Very informative and explained practical usage of handling maps via geohash. Thank you for this video.
One of the best system design videos I have ever seen, period. Let alone being the best SD video on proximity services. Awesome explanation. Mind = blown.
Glad you liked it!
I like this. It's always good to see how to optimize a solution. I was jumping for joy that I could use LIKE, and then boom, the multiple-column approach nailed it.
Very clear thinking and presentation. It would be very helpful to extend the video by 5 more minutes to cover the Quadtree or Google S2 based approaches as well.
Loved it. Such an underrated channel.
Much appreciated!
Very clearly explained. The best system design video I have seen.
Really happy to hear that : ) Let me know if you have any feedback.
Amazingly explained
This is gold! Where have you been all these years, mate? If I make an app with this knowledge, I might share the royalties as well. Thank you!
Haha! So glad to hear that. Let me know if you ever end up making an app. Would love to see how you apply these concepts.
Amazing explanation!!
Please keep up the sensational job
Thank you for watching!
I usually watch your videos on 1.5 speed and background music sounds like some crazy mobile space arcade :D
LOL yeah, I think I stopped adding music after getting that feedback from multiple people.
This is awesome. Best youtube channel regarding system design. Keep it up!
That means so much! Really appreciate it. Hope you continue getting value out of the content.
Thanks for the great video. I think it would be even better if the video mentioned the scenario where the user could be near the edge of the grid and should also return the neighbour grids. Or we should always return the neighbouring grids if it's too much work to figure out if the user is near the edge of the grid or not.
Huh, I thought I mentioned that. Maybe not then. Thank you for pointing it out.
Amazing content, very clearly explained all nuances and approaches !
Glad you think so!
thanks, very informative.. keep up the good work 👍
Thanks for watching!
Your content is amazing and so helpful. Thanks for all the effort you have put in, and for making our lives easier :)
Thank you for the kind words. Hope you enjoy the upcoming videos as well.
Great stuff. Thank you for putting your efforts on this topic.
Glad you found it helpful!
Great content and well articulated. Keep up the great work, man!
Much appreciated! Let me know what else you are interested in.
nicely explained. thanks!!
Thank you! I will start posting again soon, so please let me know what type of content interests you the most.
I've been waiting for you!
Around 22 or so I would name columns like geohash_1, _5, _10, so you could dynamically interpolate when the query comes in.
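A rough sketch of that column-naming idea, using an in-memory SQLite table. The table name, the `geohash_4`/`geohash_5`/`geohash_6` columns, the sample rows and the radius thresholds are all made up for illustration. The thresholds are only loosely based on typical cell sizes (about 1.2 km, 4.9 km and 39 km across for precisions 6, 5 and 4). Column names cannot be bound as query parameters, so the name is picked from a fixed whitelist and only the value is parameterized.

```python
import sqlite3

# Whitelist of precision -> column name (hypothetical schema).
PRECISION_COLUMNS = {4: "geohash_4", 5: "geohash_5", 6: "geohash_6"}


def precision_for_radius(radius_km):
    """Pick a geohash precision for a search radius (illustrative thresholds)."""
    if radius_km <= 1:
        return 6
    if radius_km <= 5:
        return 5
    return 4


def find_in_cell(conn, user_geohash, radius_km):
    """Return business names sharing the user's cell at the chosen precision."""
    precision = precision_for_radius(radius_km)
    column = PRECISION_COLUMNS[precision]  # interpolated from the whitelist only
    sql = f"SELECT name FROM businesses WHERE {column} = ? ORDER BY name"
    return [row[0] for row in conn.execute(sql, (user_geohash[:precision],))]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE businesses "
    "(name TEXT, geohash_4 TEXT, geohash_5 TEXT, geohash_6 TEXT)"
)
sample = [  # made-up businesses with made-up 6-character geohashes
    ("Cafe A", "9q8yyk"),
    ("Laundry B", "9q8yym"),
    ("Diner C", "9q8yzz"),
    ("Shop D", "9q8zzz"),
]
conn.executemany(
    "INSERT INTO businesses VALUES (?, ?, ?, ?)",
    [(name, gh[:4], gh[:5], gh[:6]) for name, gh in sample],
)

print(find_in_cell(conn, "9q8yyk", 0.5))  # ['Cafe A']
print(find_in_cell(conn, "9q8yyk", 3))    # ['Cafe A', 'Laundry B']
print(find_in_cell(conn, "9q8yyk", 20))   # ['Cafe A', 'Diner C', 'Laundry B']
```

Each precision column can get its own index, so every query is an equality match rather than a LIKE prefix scan.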
If the user is located near the boundary of a big cell 9a, the method proposed in the video won't get all those locations which are in proximity of the user but belonging to an adjacent big cell (9b, for example). How would you work around this issue?
IIRC one solution is to fetch data from N neighboring cells instead of the chosen cell only.
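To make that concrete, here is a small sketch of the "fetch from neighboring cells, then filter by real distance" step. The cell labels `"A"` and `"B"`, the business names and the coordinates are invented, and `index` is just a dict standing in for the database or cache. The haversine formula gives the great-circle distance, assuming a spherical Earth with a radius of 6371 km.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points (spherical Earth)."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearby(index, cells, user_lat, user_lon, radius_km):
    """Collect businesses from all given cells, nearest first, within the radius."""
    candidates = [b for cell in cells for b in index.get(cell, [])]
    with_distance = [
        (haversine_km(user_lat, user_lon, lat, lon), name)
        for name, lat, lon in candidates
    ]
    return [name for dist, name in sorted(with_distance) if dist <= radius_km]


# Toy data: the user stands near the eastern edge of cell "A".
index = {
    "A": [("Far Cafe", 0.0, 0.10)],      # same cell, but about 99 km away
    "B": [("Near Bakery", 0.0, 1.01)],   # adjacent cell, about 2 km away
}
user_lat, user_lon = 0.0, 0.99

print(nearby(index, ["A"], user_lat, user_lon, 5))        # []
print(nearby(index, ["A", "B"], user_lat, user_lon, 5))   # ['Near Bakery']
```

Querying only the user's own cell misses the bakery entirely, which is exactly the boundary issue raised above.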
Any idea how to locate a business that is moving, like a taxi or a distributor?
amazing , kudos to you
Thanks a lot!
Special thanks !
Awesome!!
Since these are stored as prefixes, can we use a trie data structure here?
If so, why don't we use it? We could use a graph database and store every prefix as a node, with the next prefix (the deeper levels) as its children.
But looking around, I don't see any trie-based geohashing implemented in a DBMS, tho.
I am not really familiar with trie-based geohashing DBs either.
I think you should have discussed the edge cases of using geohash
I believe I have another video on Geohash. That's why I didn't go into depth here.
@irtizahafiz You didn't discuss edge cases there either :(
Very good.
What tool/IDE are you using? It looks very nice, and I want to start using it too.
Hi! I am using Obsidian for the actual notes, and Miro for the diagrams.
obsidian.md/
miro.com/
Both are free tools that you can use : )
Does MySQL come with a CDC feature out of the box? If not, how is CDC implemented between MySQL and Kafka?
I think there is a connector for that: Debezium.
You need a connector. MySQL writes all the changes to its binary log (binlog), I believe. The connector reads and formats the entries from that log and then writes the formatted messages to Kafka.
@irtizahafiz Thank you for such a crisp and clear video. It was very helpful. I have a question about how the data is stored in the cache. We are storing a list of business IDs, so when a new business gets added, won't it be difficult to insert the new ID into the existing list?
Also, you showed the post processor adding entries to the cache. What if the processor dies after consuming the message? How do we avoid duplicates in the cache then?
I would really appreciate it if you could answer these questions.
1. Redis and some other caching systems give you an easy way to append to a list.
2. You could use a set, so adding the same ID twice has no effect. For example, Redis gives you an in-memory set data structure.
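A tiny sketch of why the set helps with the "processor dies and the message is redelivered" case. Plain Python dicts stand in for the cache here, and the message shape (`geohash`, `business_id`) is my assumption, not something from the video. In Redis the list append would be `RPUSH` and the set add would be `SADD`.

```python
from collections import defaultdict

list_cache = defaultdict(list)  # stand-in for a Redis list per geohash
set_cache = defaultdict(set)    # stand-in for a Redis set per geohash


def handle_with_list(message):
    """Appending is not idempotent: a redelivered message adds a duplicate."""
    list_cache[message["geohash"]].append(message["business_id"])


def handle_with_set(message):
    """Adding to a set is idempotent: a redelivered message changes nothing."""
    set_cache[message["geohash"]].add(message["business_id"])


message = {"geohash": "9q8yy", "business_id": "biz-42"}  # hypothetical event

# Deliver the same message twice, as happens when a consumer crashes after
# updating the cache but before acknowledging the message.
for _ in range(2):
    handle_with_list(message)
    handle_with_set(message)

print(list_cache["9q8yy"])  # ['biz-42', 'biz-42']
print(set_cache["9q8yy"])   # {'biz-42'}
```

With an idempotent write like this, at-least-once delivery from the queue is good enough and the processor does not need to track which messages it has already seen.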
Raising my hand: what is the difference between Geohash and Quadtree? Could you record a video explaining it?
Hi! That's a really good question. I did go down the rabbit hole of understanding how they differ from each other. I can make a video about it in the future : )
This caching design assumes all business categories are homogeneous. How might it support searching for, say, just coffee shops around a location instead of all businesses there, which might include laundromats, etc.?
You can cache results sliced by business category, though you will be using up more space.
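One way to picture the per-category slicing is shown below. The `geohash:category` key format and the function names are hypothetical, and a dict of sets stands in for the real cache. Every business ID is written under both an "all" key and a category key, which is where the extra space goes.

```python
from collections import defaultdict

cache = defaultdict(set)  # stand-in for the real cache


def cache_key(geohash, category=None):
    """Build a key such as '9q8yy:coffee', or '9q8yy:all' when unfiltered."""
    return f"{geohash}:{category}" if category else f"{geohash}:all"


def add_business(geohash, category, business_id):
    """Store the ID under both the unfiltered key and the category key."""
    cache[cache_key(geohash)].add(business_id)
    cache[cache_key(geohash, category)].add(business_id)


def lookup(geohash, category=None):
    """Return sorted business IDs for a cell, optionally for one category."""
    return sorted(cache.get(cache_key(geohash, category), set()))


add_business("9q8yy", "coffee", "biz-1")
add_business("9q8yy", "laundromat", "biz-2")
add_business("9q8yy", "coffee", "biz-3")

print(lookup("9q8yy"))            # ['biz-1', 'biz-2', 'biz-3']
print(lookup("9q8yy", "coffee"))  # ['biz-1', 'biz-3']
```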
The shared note seems to be editable by anyone. It may get lost if someone clicks the wrong button.
👍👍