Your channel is underrated; the concept was nicely explained with a good PPT presentation.
Heyy.... Thanks a lot for appreciating 🙏🏻❤️ Please support my channel by liking, subscribing, and sharing it with your friends! 🙌🏻✨
Finally I found a video that explains the PCY topic very nicely. The entire Big Data playlist seems so interestingly made❤
Glad to know that you liked my Playlist and videos! Please stay connected to my channel and Share it! 💖
Great as usual. The numerical example is incredible.
Thank you so much for appreciating ❤️🙏 Please share it with your friends!
No, like why is this so underrated 😢 I loved this, you cleared the concept, I hope the numerical comes in the exam though lol
Thank you very much for your appreciation ❤️ It really means a lot😁 Please do me a favour by sharing it with your friends 🙏🏻 Best wishes for your exams; I hope this numerical comes for sure and you crack it🔥
@@ataglanceofficial the paper was all theoretical but mine still went well:)
Woahh that's great! 🥳 Amazing!
Very good presentation
Thanks for your valuable comment ma'am 😊
Your content is the best✨
Thank you so much for your appreciation ❤️🙌🏻 Please like, subscribe, and share with your friends😊
amazing just amazing wow...
Thank you so much 😊 Please Share 🙏
10:00 Why are you using the condition of item count less than 1 when the given threshold is 2?
Heyyyy! Look, this algorithm is used for Big Data, and in Big Data there may be a redundant or useless item that is not actually part of any of the given transactions. So, just to check for that, the algorithm first checks whether the count is at least 1 for every unique element in the transactions. I hope that helps you❤️ Please share it with your friends😊 Best wishes for your exams! 💕
@@ataglanceofficial okay thanks
Very very nice tutorials ❤️❤️💫💫🔥🔥😍😍
Thank you so much for your kind words! Please subscribe and share it with your friends 🙏🏻 Best wishes for your exams 💕
Great explanation! You are a fantastic teacher! One question, as I don't have much background in computers: a pass is when the data is read into main memory (it is not shown in the main memory diagram, but I'm guessing this is a given)? Then in between the passes, is the data deleted from main memory? Is it removed from main memory after the item counts are made, in order to make room for the hash table pairs? Then the item count is reduced to frequent items and the hash table to a bitmap, in order to make space to reload the dataset into main memory?
Thank you for your great compliment❤️
You really asked a good question. So look,
1. Yes, you're right that a "pass" refers to reading the dataset into main memory, which is an essential step in the PCY algorithm. While it might not be explicitly shown in the main memory diagram, it's indeed a given.
2. The transaction data itself is not kept in main memory between passes; it stays on disk and is read again on the next pass. What stays in memory are the structures built from it: during the first pass, the item counts and the hash-table bucket counts, both of which are filled while the data is being read.
3. After the first pass, once you've identified the frequent items, you don't keep the raw data in memory. You keep only what the second pass needs: the table of frequent items and the bucket counts, which are then compressed.
4. You're right about reducing the item counts to just the frequent items and converting the hash table to a bitmap. This saves memory space, so that when the dataset is read again in the second pass there is room to count the candidate pairs.
Hope this helps! :) There's a small sketch of the two passes just below.
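To make the two passes concrete, here is a minimal Python sketch of the PCY flow. The transactions, the support threshold of 2, and the 10 buckets are illustrative assumptions, not necessarily the exact numbers from the video; the (i*j) % 10 hash is the one discussed in these comments.

from itertools import combinations

# Illustrative dataset and parameters (assumptions, not the video's exact numbers).
transactions = [
    [1, 2, 3],
    [2, 3, 4],
    [1, 4, 5],
    [1, 2, 4],
]
support_threshold = 2
num_buckets = 10

# Pass 1: read the baskets, count single items, and hash every pair into a bucket.
item_count = {}
bucket_count = [0] * num_buckets
for basket in transactions:
    for item in basket:
        item_count[item] = item_count.get(item, 0) + 1
    for i, j in combinations(sorted(basket), 2):
        bucket_count[(i * j) % num_buckets] += 1

# Between passes: keep only the frequent items and compress the bucket counts to a bitmap.
frequent_items = {i for i, c in item_count.items() if c >= support_threshold}
bitmap = [1 if c >= support_threshold else 0 for c in bucket_count]

# Pass 2: read the baskets again, counting only candidate pairs
# (both items frequent AND the pair hashes to a frequent bucket).
pair_count = {}
for basket in transactions:
    for i, j in combinations(sorted(basket), 2):
        if i in frequent_items and j in frequent_items and bitmap[(i * j) % num_buckets]:
            pair_count[(i, j)] = pair_count.get((i, j), 0) + 1

frequent_pairs = {p: c for p, c in pair_count.items() if c >= support_threshold}
print(frequent_pairs)  # pairs that meet the support threshold

The only things held in main memory across the two passes are the frequent-item table and the bitmap; the baskets themselves are streamed in again for pass 2.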
❤️🚀
♥️😊
Great video man thanks!❤️
Thank you so much for appreciating🙏🏻😇 Please share it with your friends ❤️ Best wishes for your exams🔥
Nice ❤
Thanks a lot Mr. Tabish❤️😄
@@ataglanceofficial ❤😍
Finally got better results here compared to other content.
Thank you so much for appreciating 🤗 Please share it with your friends ♥️ and with your juniors too✨
Sir, I don't understand the use of the hash function when we are not even using it to check the candidate pairs. You were only checking the threshold even in the last step.
Sir, please clarify my doubt, please sir.
Heyy, I would request you to watch the full video carefully.. you will understand why I did so! Thanks for watching ☺️
Sir, I have two doubts:
1) In Step 1, do we have to remove elements with a frequency less than 1, or elements with a frequency less than the threshold value?
2) Is the hash function always fixed, i.e. (i*j)%10?
Sir, please reply as soon as you see the comment.. my exams are nearby!!
Answer to your doubts:
1) Remove elements with a frequency less than 1, because this algorithm is used to process real-time data, so you need to check this condition; also, this step is just to calculate the support of every product.
2) The hash function is not fixed; it can be anything, but it is generally chosen by considering opinions from domain experts and the dataset. My suggestion: choose the hash function in such a way that you get different buckets for different products (a tiny sketch is given just below).
Please share it with your friends ♥️
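To illustrate point 2, here is a tiny Python sketch of the (i*j) % 10 hash discussed here; the example pairs are my own and just show that different pairs can still land in the same bucket.

def bucket(i, j, num_buckets=10):
    # Hash a pair of item ids into one of num_buckets buckets.
    return (i * j) % num_buckets

print(bucket(2, 5))  # 0 -> the pair (2,5) goes to bucket 0
print(bucket(4, 5))  # 0 -> the pair (4,5) lands in the same bucket (a collision, which is fine)
print(bucket(1, 3))  # 3 -> the pair (1,3) gets its own bucket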
@@ataglanceofficial
Sir, what if.. suppose we are using the hash function (i*j)%10 and we get the same bucket value for two different pairs?
It's okayyy to get the same bucket value for more than one pair.. what I told you was just a suggestion..
You can go either way.
@@ataglanceofficial
Ok, thanks sir for clearing my doubt and also for giving instant replies..... Thanks a lot sir 🙏
@@ataglanceofficial
One more doubt sir.... sir, as you told, we need to remove elements with a frequency less than 1 in Step 1... that means elements with frequency 0... does it mean that such an element is never present in any transaction??
This seems wrong. See, technically we are not even using the hash function to find C2; we are directly finding L2 by counting all the pairs directly.
The point of PCY was that we reduce C2, so that only the pairs in C2 have their counts calculated.
If we have millions of transactions, then we aren't going to find the count of all pairs; that's exactly what we're trying not to do.
Expecting an explanation from your end, At a Glance; btw, salute to you for the videos, they helped me a lot.
How do we construct hash functions accordingly???
1,4 has no pair
Heyyyy! In transactions T3 and T4 you can find (1,4)
So should we also consider pairs that are not next to each other?? Like in T4, 1 and 4 are not next to each other.
Yes yes you are correct.... You can consider it like that✨
Hope it helps you! Thanks a lot for watching.... Please Share it with your friends ❤️ Best wishes for your exams 💕
@@ataglanceofficial plz check again
Heyy, look, it doesn't mean that the pair always has to be side by side in the transaction. You can create combinations from each transaction. Now, in T3 we have 1,4,5, so we can have the pairs [(1,4),(1,5),(4,5)]; similarly, in T4 we have the items 1,2,4, hence we can make the pairs [(1,2),(1,4),(2,4)].
Now, from these two transactions T3 and T4, we get the count of the pair (1,4) as 2, since it is repeated in both transactions🙌🏻 Hope this definitely clears your doubts now! 💕 There's a small sketch of this counting just below.
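For anyone who wants to verify this, a minimal Python sketch of the counting, using only the items of T3 and T4 as given above:

from itertools import combinations

T3 = [1, 4, 5]
T4 = [1, 2, 4]

pair_count = {}
for basket in (T3, T4):
    # Items need not be adjacent: every 2-item combination in a basket is a pair.
    for pair in combinations(sorted(basket), 2):
        pair_count[pair] = pair_count.get(pair, 0) + 1

print(pair_count[(1, 4)])  # 2 -> (1,4) occurs in both T3 and T4
print(pair_count)          # the five distinct pairs, with (1,4) counted twice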