How does Google quickly find one document among billions of documents?
- Published: 28 Jun 2024
- Understand the data structures and internal process by which search engines index documents and make them easy to search
#searchenginedesign #searchsystemdatastructure #designsearchengine #invertedindexes
#systemdesigntips #systemdesign #computerscience #learnsystemdesign #interviewpreperation #amazoninterview #googleinterview #uberinterview #micrsoftinterview
Great article. Keep these coming, Sir :)
Nice work laying out the basic building blocks of any search engine! (especially the inverted index section)
I can't find content like this anywhere on RUclips. These videos show the hard work behind them. U r really great.
Thnq very much..🙌🙌
Great content on this channel bro..keep up the good work
You are excellent at explaining System Design. Watching your videos motivates me to learn in depth and makes my learning experience interesting. There is so much to learn from your videos, and they also show that you enjoy teaching and sharing your immense knowledge.
Your channel is a one-stop shop for SDI prep. Thank you for another great video. Could you please share a system design for a Yelp-like service?
Cheers!
Bhai you are awesome!!! I cleared so many interviews because of you!! Sending you more power and may all your wishes come true!
Grt, it's good knowledge on inverted index search. That's cool.
But I was expecting a bit more on the actual design: system diagram, flow, etc.
Your videos are great!
really cool video man, keep such videos coming !
What an amazing simplified explanation of such a big concept. Thanks for sharing your knowledge.
11:00
18:17 noise removal
26:35 indexMap[keyWord] = [ { sentenceIdx (which sentence), pos (position within the string) }, ]
31:20 Conjunction = AND / Disjunction = OR; 31:25 remove duplicates
34:35 Union: for three consecutive words, the order matters
Thx
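The notes above can be sketched roughly as follows. This is a minimal illustration, not the video's own code: the function names (`build_index`, `search_and`, `search_or`, `search_phrase`) and the sample sentences are assumptions made for the example.

```python
from collections import defaultdict

def build_index(sentences):
    """Map each lowercased word to a list of (sentence_idx, position) pairs."""
    index = defaultdict(list)
    for sentence_idx, sentence in enumerate(sentences):
        for pos, word in enumerate(sentence.lower().split()):
            index[word].append((sentence_idx, pos))
    return index

def docs_for(index, word):
    """Set of sentence indices containing the word (duplicates removed)."""
    return {sentence_idx for sentence_idx, _ in index.get(word.lower(), [])}

def search_and(index, words):
    """Conjunction: sentences containing every query word."""
    sets = [docs_for(index, w) for w in words]
    return set.intersection(*sets) if sets else set()

def search_or(index, words):
    """Disjunction: sentences containing at least one query word."""
    return set().union(*(docs_for(index, w) for w in words))

def search_phrase(index, words):
    """Sentences where the words appear consecutively, in the given order."""
    if not words:
        return set()
    hits = set()
    for sentence_idx, pos in index.get(words[0].lower(), []):
        if all((sentence_idx, pos + k) in index.get(w.lower(), [])
               for k, w in enumerate(words)):
            hits.add(sentence_idx)
    return hits

sentences = [
    "the quick brown fox jumped over the lazy dog",
    "the quick brown dog",
    "the lazy fox",
]
index = build_index(sentences)
and_hits = search_and(index, ["quick", "fox"])
or_hits = search_or(index, ["quick", "fox"])
phrase_hits = search_phrase(index, ["quick", "brown", "fox"])
```

Storing the position next to the sentence index is what makes the ordered three-word match possible; plain AND and OR only need the sentence index.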
This guy has done so much hard work to provide free education. Still, a few people disliked this video. Shame on them.
Top content. Thank you!
While watching the video I can observe an increase in my knowledge. What an explanation!
Thx Narendra for this wonderful episode
Very nice explanation sir 👍. Thank you very much.
Can't explain in words how useful this video is, thanks a lot!!!
Thank you for sharing! Very clear and learner centered. What an amazing educator!
This was really helpful. Thank you so much. :)
Great super explanation...
Thank you for your interesting video. Thanks a lot.
Thank you, it's a great video
Very good channel I am enjoying every bit of learning here !
Bro, you are doing such great work, and free of cost too.
Wow..You are great..
Very helpful, thank you very much!
nice explanation😊
amazing.........just amazing......thank u narendra!!!
Amazing video!
Such an amazing video, thank you so much Narendra.
Narendra Sir is a legend RUclipsr in Sys Design. Very thorough and knowledgeable videos
Awesome explanation :-)
Very clear! Thank you so much for your work 😌
"Basically" I got your point brother! 😂
awesome ....
really enjoyed the illustration of Indexing
Nice info...!!!!!.
You sir, are a legend! 🙏🏻
Am just a normal guy !!
Very nice explanation sir
Very good, guru
Awesome 👏
Great Lecture sir !!!
I enjoyed listening and learning😇😇
Great content ... great explanation ... it is difficult to find such content on youtube.
So basically, it is a great video!
Basically 😂
😂😄
Basically it is
great video man!
Great contents. Thanks for your video.
really good video
good explaining ...
So great
Great
Thank you so much.
You earned a sub Sir
B-tree indexing has a time complexity of O(log n), since each lookup is a binary-search-style descent. Why did you say that it takes O(1)?
Correct. What was explained is caching, not indexing, but overall the video is very helpful.
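The difference between the two lookups can be illustrated with a small sketch. A sorted Python list searched with `bisect` stands in for an ordered index here (it is not a real B-tree), and a `set` stands in for a hash-based lookup; both choices are assumptions made for illustration.

```python
import bisect

terms = sorted(["brown", "dog", "fox", "jumped", "lazy", "over", "quick", "the"])

def sorted_lookup(sorted_terms, term):
    """Binary search over sorted keys: O(log n) comparisons."""
    i = bisect.bisect_left(sorted_terms, term)
    return i < len(sorted_terms) and sorted_terms[i] == term

# Hash lookup: O(1) on average, but the keys carry no ordering.
hash_terms = set(terms)

found_sorted = sorted_lookup(terms, "fox")
found_hash = "fox" in hash_terms
missing = sorted_lookup(terms, "cat")
```

The ordered structure costs more per lookup but supports range and prefix scans, which a plain hash does not.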
👍
Once that "basically" sets in, it's hard to get over
I literally counted how many times "basically" is used. I appreciate the fact that people like him are going out of their way to contribute to society, but it would have been sounder content if it had been planned and presented more methodically.
This is really informative, and thank you for the knowledge. This will really be helpful for me as I'm into digital marketing. Thank you so much sir, really loved the way you explained it
Can you please increase the volume? It's hard to hear on my laptop. I noticed the same thing with your previous videos as well. But nice and informative videos.
Best explanation.
Please upload POS system.
Great content man, well explained. My only suggestion is to work on the audio quality; even at max volume it's not properly audible
Nice explanation :) , one small suggestion where the order of the words matter is for example query: "Distance between Mumbai to Delhi". This video helped me to brush up information retrieval techniques.
When you type "Distance between Mumbai to Delhi", I guess the expectation is to get a number ("26 hr (1,419.7 km) via NH 48") as an answer, and not the documents containing those words. Essentially, we would like the search engine to understand the meaning of our question, and then respond with a factual answer. I suspect that is handled by different techniques like "Semantic search" or "extractive question answering".
How do you gather such in-depth details of these systems? Really great work and I'm enjoying it.
You can too, keep reading :)
@@TechDummiesNarendraL what books and resources do you recommend?
@@mmanuel6874 Best place is the papers published by these companies and their tech talks.
@@mmanuel6874 Google!
It's a great video Naren, but I would like to understand how refactoring or removal of an index entry/keyword happens. Also, does Google store the index across multiple regions, e.g. APAC, AMERICA? How big can the index be in Google's case?
@NArendra, thanks for the detailed explanation. Nice work. So if there is a new document with the word brown, how do you update the index? Is updating the index more expensive? Can you do an incremental load of the index?
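For a small in-memory index, an incremental update only touches the posting sets of the words in the new document. The sketch below assumes a dictionary-of-sets index and a hypothetical `add_document` helper; real engines typically batch such updates into segments, which is not shown.

```python
from collections import defaultdict

def add_document(index, doc_id, text):
    """Incrementally add one document: only the postings of its own words change."""
    for word in text.lower().split():
        index[word].add(doc_id)

index = defaultdict(set)
add_document(index, 0, "the quick brown fox")
add_document(index, 1, "the lazy dog")

# A new document arrives later; no rebuild of existing entries is needed.
add_document(index, 2, "a brown dog")
brown_docs = sorted(index["brown"])
```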
Thanks for taking time to make this video. Basically 😂 explained very well .
Very informative, thanks for posting this video. One question: is this how all search engines work, e.g. Windows file system search, Outlook/Gmail search, web search?
Make a bit index between words and docs, and then it will be easy to do conjunction and disjunction
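One way to read this suggestion is a bitmap per word, where bit i is set when document i contains the word. The sketch below uses Python integers as bitmaps; `build_bitmaps` and `doc_ids` are hypothetical names chosen for the example.

```python
def build_bitmaps(docs):
    """Map each word to an int whose bit i is set when doc i contains the word."""
    bitmaps = {}
    for doc_id, text in enumerate(docs):
        for word in set(text.lower().split()):
            bitmaps[word] = bitmaps.get(word, 0) | (1 << doc_id)
    return bitmaps

def doc_ids(bits):
    """Decode a bitmap back into a sorted list of document ids."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]

docs = ["the quick brown fox", "the lazy dog", "quick brown dog"]
bitmaps = build_bitmaps(docs)

both = doc_ids(bitmaps["quick"] & bitmaps["dog"])    # conjunction (AND)
either = doc_ids(bitmaps["quick"] | bitmaps["dog"])  # disjunction (OR)
```

With this layout, AND and OR become single bitwise operations over the whole document set.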
Great Video for explaining the nitty gritty details of inverted search. If you at least show the full flow of the design, would be much appreciated .
If you put a donation link on your videos I would send you money at this point, your vids have helped me so thoroughly!
Nice explanation of TF-IDF preprocessing. But not clear how ranking and ML are organized
🤩🤩🤩
@Tech Dummies Narendra L, it's ok to build a matrix for 3 documents. How about a case where there are a billion docs to scan through and build an inverted index matrix. The matrix would be huge right?
Can you make a new video? I think search has moved quite far ahead, from TF-IDF to gen AI embeddings and RAG methods
Great job!!
One important suggestion. Use external microphone to cut off background noise and better sound quality.
Otherwise your Hard work would not be paid off.
Sure, I will have to buy one
Awesome stuff.One doubt:
In the last few minutes, on prefix search: I think storing the keys in sorted order will make insertion complex, since you have to search the table with binary search and then insert. B-trees or tries can be used, but they are not as efficient to search as a hash. What we can do instead is search jum(a), jum(b) and so on up to z. Since there are only 26 letters and a limited number of other characters, that is more efficient than maintaining a sorted list.
jum* doesn't tell us how long the word would be, so just searching for jum(a), jum(b) and so on would not work, I think
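A rough sketch of how a sorted key list handles `jum*` regardless of word length: two binary searches bracket every key that starts with the prefix. The helper name `prefix_search` and the sentinel character are assumptions made for this example.

```python
import bisect

def prefix_search(sorted_terms, prefix):
    """Return every term starting with prefix, using two binary searches."""
    lo = bisect.bisect_left(sorted_terms, prefix)
    # Assumption: "\uffff" sorts after any character that occurs in the terms.
    hi = bisect.bisect_right(sorted_terms, prefix + "\uffff")
    return sorted_terms[lo:hi]

terms = sorted(["jug", "jump", "jumped", "jumper", "jumps", "june", "quick"])
matches = prefix_search(terms, "jum")
```

The matching keys form one contiguous slice of the sorted list, so the length of each word never has to be known in advance.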
can we have a video on dream11 architecture
Great video again Narendra! I am not sure what the purpose of the frequency in this table is. It looks like we will never use it when querying. Also, why can't the data structure that stores the index be a Map? That way you can have very fast lookups. But I agree you can't do regex-type matching in it.
I suspect the frequency is used to understand the semantics. A word and words that appear together or frequently around it are all useful information to gauge the relevance of a document.
How do you learn these things? Can you please suggest a book to refer to
on how search engines work and how to build one?
If millions of documents contain these words, you cannot run the intersection in memory. That is one critical part we need to focus on. How do we make it work?
How can we update this table when new stuff is added? Should we store it in Redis/cache to speed things up?
Noise removal got slipped ... But it's ok I figured it out... Nice lecture dude.
He did at 17:50
@@helloworld6679 yes yes yes u r rt dude .... Perfect ...
Can you shed some light on how intersections are performed when you have billions of docs?
How will millions of queries be searched over the reverse index? How is concurrency achieved? Where do union and intersection happen, and how quick are they? Can a query go to the exact shards based on the query words and perform the union there?
Thanks for the great video and explanation. There is an issue in the term-document table. The word "Quick" appears twice in that table, at row 0 (as Quick) and row 8 (as quick), with different values. This is a bit confusing, as I am not sure if I am missing something.
Yes, you can apply toLowerCase(); it was a mistake. You can ignore that
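A small sketch of the fix described in this reply, written in Python with `str.lower()` standing in for `toLowerCase()`. The `term_counts` helper and the sample text are assumptions made for illustration.

```python
from collections import Counter

def term_counts(text):
    """Count terms after lowercasing so 'Quick' and 'quick' share one row."""
    return Counter(word.lower() for word in text.split())

counts = term_counts("Quick brown fox and the quick dog")
quick_count = counts["quick"]
```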
Basically this is a video on how to use basically in all sentences you say
How does fuzzy search work? What if I write Quock instead of Quick? How does that come up as a result?
Off topic: have you ever wondered about information retrieval in your brain? You can bring back incidents/emotions/moments in milliseconds, and the brain is not power intensive at all. What an amazing design our brain has.
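One common approach, sketched here under the assumption of a small in-memory vocabulary, is to compare the query against indexed terms by edit distance and accept near matches. `edit_distance` is the standard Levenshtein recurrence; `fuzzy_lookup` and the `max_distance` threshold are hypothetical choices for the example, and a production engine would avoid scanning the whole vocabulary.

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free when equal)
            ))
        prev = curr
    return prev[-1]

def fuzzy_lookup(vocabulary, query, max_distance=1):
    """Vocabulary terms within max_distance edits of the query."""
    q = query.lower()
    return [t for t in vocabulary if edit_distance(q, t) <= max_distance]

vocabulary = ["quick", "brown", "fox", "lazy", "dog"]
suggestions = fuzzy_lookup(vocabulary, "Quock")
```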
The brain operates at about 20 watts; this is about 30% more than a Raspberry Pi 4B at theoretical consumption (5V @ 3A = 15w)
I don't think u have included technical concept. And technology they use.
Great content... Voice not so clear
What if we search for "Jumped"? Do we expect the search engine to lemmatize the search keyword?
Suppose that words "quick", "brown" and "fox" are associated with millions of documents. How would you find an intersection of them in a fraction of a second?
Just compare the 'bits' ...
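Besides bit comparison, a common technique is to keep each word's document ids sorted and intersect the lists with a linear merge, starting from the shortest list. The sketch below assumes sorted lists of integer ids; the function names and sample ids are made up for the example, and sharding across machines is not shown.

```python
def intersect_sorted(a, b):
    """Intersect two sorted doc-id lists in O(len(a) + len(b)) time."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def intersect_all(postings):
    """Intersect several lists, shortest first so the running result stays small."""
    ordered = sorted(postings, key=len)
    result = ordered[0]
    for plist in ordered[1:]:
        result = intersect_sorted(result, plist)
        if not result:
            break  # nothing can match once the running result is empty
    return result

quick = [1, 4, 7, 9, 12]
brown = [2, 4, 9, 10, 12, 15]
fox = [4, 12, 20]
common = intersect_all([quick, brown, fox])
```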
"basically" it is nice tutorial xD
Please improve the audio quality 👏
basically
Great video! A bit of trivia - "the quick brown fox jumped over the lazy dog" should have an 's' in it -> "the quick brown fox jumps over the lazy dog". It's a typing exercise to hit all 26 letters. en.wikipedia.org/wiki/The_quick_brown_fox_jumps_over_the_lazy_dog. ;)
can't we use a trie here?
Can u make video about networking deeply
Some time later :)
Basically, the basicality of the basic is quite basical.
You got me
Tech Dummies great videos. I love them and subscribed. Thank you 🙏
You basically like to say basically a lot, huh? LOL. Thank you for the content, keep it up.
You need better lighting, it's too dark
What is the point of storing frequencies?
It is relevant to ranking the search results. The TF-IDF and BM25 concepts will help you
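A minimal sketch of how stored frequencies feed a TF-IDF score. It uses one simple variant (term frequency divided by document length, and a natural-log IDF); the `tf_idf_scores` helper and the sample documents are assumptions, and BM25 is not shown.

```python
import math
from collections import Counter

def tf_idf_scores(docs, query):
    """Score each doc by the sum of tf * idf over the query terms."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            # Document frequency: how many docs contain the term at all.
            df = sum(1 for t in tokenized if term in t)
            if df == 0 or not tokens:
                continue
            tf = counts[term] / len(tokens)
            idf = math.log(n_docs / df)
            score += tf * idf
        scores.append(score)
    return scores

docs = ["quick quick fox", "quick brown dog", "lazy dog"]
scores = tf_idf_scores(docs, "quick")
```

A document that repeats the query term scores higher than one that mentions it once, which is why the per-document frequency is worth storing in the index.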