I learnt about UnionFind from this video. A word that means nothing to me 2 months ago! Thank you for the excellent explanation. You are much better at teaching than my college professors.
Union Find is a cool algorithm but the DFS is so much less code, so less chances of errors and typos. It applies to way more problems too. I was doing number of provinces and the code was way more complicated with union find
Just wanted to say thank you for these videos man. I've been putting off the leetcode grind for a couple of years now and your videos are the clearest on youtube by far and are making learning a breeze! :)
Some Thoughts: Use Union-Find when you have a large, sparse graph and need efficient connectivity checks and merges. It is especially suitable for dynamic graph problems where the structure of the graph changes frequently. Use DFS when you need to traverse the graph, find paths, or need more information about the graph's structure. It is also a good choice for dense graphs and for problems where readability and simplicity are more important.
I find using DFS, VisitedSet and a Counter to be a much more natural way to solve this to me. This is how i approached it below. But it is good to know Union Find algorithm too in case it ever comes in handy. # Create an adjacency list to represent the graph graph = {i: [] for i in range(n)} for edge in edges: graph[edge[0]].append(edge[1]) graph[edge[1]].append(edge[0]) # A set to keep track of visited nodes visited = set() def dfs(node: int): # Mark the node as visited visited.add(node) # Visit all the neighbors for neighbor in graph[node]: if neighbor not in visited: dfs(neighbor) # Counter for the number of connected components components = 0 # Iterate through all nodes for i in range(n): if i not in visited: # If a node hasn't been visited, it's a new component dfs(i) components += 1 return components
it's also more efficient. Only reason to use union find is if you're dynamically tracking the components in a graph where you are dynamically adding edges, like in the kruskal algorithm
This is super awesome! I have been using BFS or DFS traversal for pretty much all work-related graph problems so far. I have learned something new today!
@@jacked-boy oh I'm working on projects that need to deal with 3D meshing, and other geometry related algos. But honestly this is (union find) such a beautiful algo for finding disjoint sets for my problems. Hope this helps!
My initial thought was to pass the parent array to a set and return the length of it. This way I thought we can get all the unique parents and it will match the number of connected components, But your trick to decrement the n on each union is very nice.
This doesn't work unless you iterate over all "nodes" (1 through n) at the end and compress the paths to make sure every node is at depth 1 from the root in each component. What you can do however is iterate through the nodes and simply check if the parent of that node is equal to the root... this will tell you how many root nodes there are which equals the number of components.
Amazing video with detailed concept illustration! I found a recursion could be used in find function. def find(n1): # find the parent of n1 if n1 == par[n1]: return n1 # path compression par[n1] = par[par[n1]] return find(par[n1])
Just a little bit of change to your code can make it better actually. It can guarantee the height of the tree is no more than 2 def find(n1): # find the parent of n1 if n1 == par[n1]: return n1 # path compression par[n1] = find[par[n1]] return par[n1]
NIce video - I think the key takeaway for the complexity analysis is that this algo would be running in O(E * V) Time (where V is no of node aka N) if implemented without that par[res] == par[par[res]] line. Since you added that optimization the traversal of the nodes is cut by half leading to a O(E * log(V)) Time which is why Union Find is more optimal. :)
Why we need the while loop in the find function? I thought we are always setting the ultimate parent in parent array. For example, in drawing explanation, after union 2, we set par[2] to 0, not 1
On union, the algorithm only updates the root node parent of the tree being merged into the larger tree. The while loop is necessary because of that. Imagine connected components 1-2 being merged into connected components 3-4-5. After the union, parent[1] will be 3 but parent[2] will still be 1. The union only updates the parent of the root merging into the larger tree. The path compression limits the depth of that while loop over time by effectively smashing the tree down, but that happens on the find, not on the union, so because of that, the while loop is necessary on the find.
I think we should only change the rank when they are equal and increment it by one, because all the other cases we can still make a tree of max rank by joining to the root of the biggest tree. but if the ranks are equal the biggest tree height is going to increase by one because we are adding an edge.
You don't have to update the rank of p2 when (rank[p2] > rank[p1]) because we are only adding one node at a time, so the length of (root p2 node) is either gonna stay less than it's rank or become equal to the rank.
DFS solution that remains consistent throughout all problems: class Solution: def countComponents(self, n: int, edges: List[List[int]]) -> int: res=0 visit=set() graph = {i:[] for i in range(n)} for n1,n2 in edges: graph[n1].append(n2) graph[n2].append(n1)
def dfs(i): if i in visit: return False if i in cycle: return True
cycle.add(i) T=1 for j in graph[i]: T = T and dfs(j) return T
for i in range(n): cycle=set() if dfs(i): res+=1 visit.update(cycle)
Thank you Neetcode! I don't see why you are using the rank array because it doesn't really matter; once we see that their parents aren't the same we can connect them. I posted my code below as a reference. class Solution(object): def countComponents(self, n, edges): par = [i for i in range(n)] def find(n): if n == par[n]: return n return find(par[n]) count = n for n1, n2 in edges: p1 = find(n1) p2 = find(n2) if p1 == p2: continue else: par[p2] = p1 count -= 1 return count
for 547 -> class Solution: def findCircleNum(self, isConnected: List[List[int]]) -> int: n = len(isConnected) parent = [i for i in range(n)] rank = [1] * n
def find(i): p = parent[i]
while p != parent[p]: parent[p] = parent[parent[p]] p = parent[p]
Practice for free on geeksforgeeks.org (Note: Problem is named differently). * "Number of Provinces": practice.geeksforgeeks.org/problems/number-of-provinces/1/ * Theory: www.geeksforgeeks.org/connected-components-in-an-undirected-graph/
Your videos are amazing, thank you for the clear and detailed explanations. could you please share the spreadsheet link? You mentioned in the video that it would be in the description, but I couldn't find it though.
Thank you very much for this video. I devoured the union find in one go. I usually tend to avoid this:(. Now going to Alien Dictionary. Topological sort 😁
line 10 to 11 in the code, should be replaced with this right? else there's no use of the while loop temp = par[res] par[res] = par[par[res]] res = temp
Dope! When I first saw the video I was like "too easy, just dfs blah blah ...". Haven't seen anyone solving it using DSU. Thank you sir! One question though, you don't need rank to solve this problem do you?
I completely got rid of the find function, and just used the parents array. The solution still works. What's going on here? class Solution: def countComponents(self, n: int, edges: List[List[int]]) -> int: par = [i for i in range(n)] rank = [1] * n def union(a, b): p1, p2 = par[a], par[b] if p1 == p2: return 0
if rank[p1] > rank[p2]: par[p2] = p1 rank[p1] += rank[p2] else: par[p1] = p2 rank[p2] += rank[p1] return 1 res = n for n1, n2 in edges: res -= union(n1, n2) return res
We want to increase the rank, or size, of the connected component with parent p2 by the rank of the connected component of p1. The connected component of p1 might not always be rank of 1. It might have 2 or 3 or more nodes in the component.
How does the path compression not jump to far into the future? `par[res] = par[par[res]]` seems like it should skip to far, but instead since the parent index is also it's value it stays at the same element. For example, if par[res] = 5 and res = 5 then par[par[par[par[5]]]] == 5 because the 5 element is also 5 so it just keeps looping.
suppose you are looking at par[5] = par[5] # now you are at the same level par[5] = par[par5]] # now you are only going one level deep, which is something you are gonna do in the next jump anyway.
Python hashmap only version: class Solution: def countComponents(self, n: int, edges: List[List[int]]) -> int: parent_map = { i: i for i in range(n)} subtree_size = { i: 1 for i in range(n)}
num_components = n for starting_node, ending_node in edges: if self.union(starting_node, ending_node, parent_map, subtree_size): num_components -= 1
Pro Tip: When you reach a point in your narration where you catch yourself saying something to the effect of, "I didn't mention this before..." you should probably go back and mention it. Case in point: 9:1012:20
A small note about the rank array: this approach works for keeping track of which node's rank is bigger, but it becomes inaccurate because we need to reset rank of merged nodes to 1. Say, for example, you wanted to know about the sizes of the connected components, then this would become important.
Dodged the union find in this video using DFS, coming back after few days for another problems that needed union find xDDD So yeah, don't skip any new algorithm you saw!
the ranking part confused me of why it is being used. When performing the union, why does it matter if you set the parent of the of the smaller ranking to the bigger one? since the result is just the number of nodes minus unions you can just get rid of the entire ranking system correct? LC accepted it. Am i missing something? Using your code and removing ranking it would look like this: def countComponents(self, n, edges): par = [i for i in range(n)] def find(n1): res = n1 while res != par[res]: # path compression par[res] = par[par[res]] res = par[res] return res def union(n1, n2): p1, p2 = find(n1), find(n2) if p1 == p2: return 0 else: par[p1] = p2 return 1 res = n for n1, n2 in edges: res -= union(n1, n2) return res
Minimize Tree Height: The primary purpose of using rank is to minimize the height of the trees that represent the sets. By always attaching the smaller tree to the root of the larger tree, the height of the tree increases only when two trees of the same height are united, and even then, the increase is by just one. This helps in keeping the trees as flat as possible. Optimize Find Operation: The efficiency of the find operation depends on the height of the tree because the operation requires traversing from a node up to the root of the tree. By minimizing the height of the trees, the find operation becomes faster, as it needs to traverse fewer levels.
I totally think the same here. imo union find is more tricky to catch. for example : from collections import defaultdict class Solution: def countComponents(self, n: int, edges: List[List[int]]) -> int: count = 0 seen = [False] * n graph = defaultdict(list) for a,b in edges: graph[a].append(b) graph[b].append(a) def dfs(i): seen[i] = True for node in graph[i]: if not seen[node]: dfs(node) for i in range(n): if not seen[i]: count+=1 dfs(i) return count
don't get me wrong, I love those videos. I learn so much for them, and I love the fact that it gives another opportunity to learn union find. But i was just meaning that it might be more trivial to understand the alternative imo. Thanks again for those amazing videos Mr Needcode.
@@doronbaruch665 hey thank you for the dfs code. May I know why you used a defaultdict instead of a ordinary list? I was initially coding like this, I couldn't figure out why it returns what it returns and why everyone else is using a defaultdict. Thank you in advance! n = 5 edges = [[0,1],[1,2],[3,4]] adjList = [[]] * n for i in range(len(edges)): adjList[edges[i][0]].append(edges[i][1]) adjList[edges[i][1]].append(edges[i][0]) print(adjList)
Hey great solution, but still not the got the reason why is it necessary to the grandparent.????? And also please dry run the code once you write it as it gets easier for us to understand.
I suspect that this code has a bug. If 1-9 connects and 3-4-5 connects. Then an edge 1-4 comes, then 3's parent will remain 3 but 3's parent should have been 1. Is that right?
For the purpose of just calculating connected components, it really doesnt matter but for problems where actual parents matter, Union find isnt the right approach.
rank[p2] += rank[p1] this will work but according to concept of union find it is wrong. p1 and p2 is like subtree and when we are combining these two tree, we need to make root and that will increase the rank by 1 not addition. rank[p2]++;
When speeding up the find function with parent[p] = parent[parent[p]], why does this line do nothing if the grandparent doesn't exist? I would expect it to return an error.
The grandparent will always exist, because even if it's not pointing to a different index (node), the grandparent of p will be the parent of p-as a sort of circular pointer. This is similar to how, in the beginning of the code, we set each parent to be itself, in the line `par = [i for i in range(n)]`. What helped me understand this is to think of parents as pointers to nodes, instead of nodes themselves. Watching/solving LeetCode 287 "Find the Duplicate Number" is helpful in understanding this, as it similarly treats the indexes of an array as though they were nodes in a graph.
## BFS and DFS solution comment either one to use class Solution: def countComponents(self, n: int, edges: List[List[int]]) -> int:
visited=set() q=collections.deque() count=0 hmap=collections.defaultdict(list) for u,v in edges: hmap[u].append(v) hmap[v].append(u) for node in range(n): if node not in visited: ##BFS or DFS count=count+1
self.bfs(visited,q,node,hmap) self.dfs(visited,node,hmap) return count def bfs(self,visited,q,node,hmap): q.append(node) visited.add(node) while(len(q)): curr=q.popleft() for child in hmap[curr]: if child not in visited: q.append(child) visited.add(child) def dfs(self,visited,node,hmap): ## Base if node in visited: return visited.add(node) for child in hmap[node]: self.dfs(visited,child,hmap)
I couldn't understand find. Without find it works fine class Solution: def countComponents(self, n: int, edges: List[List[int]]) -> int: par = [i for i in range(n)] rank = [1]*n def find(node1, node2): if par[node1] == par[node2]: return 0 else: if rank[node1] >= rank[node2]: par[node2] = par[node1] rank[node1] += rank[node2] else: par[node1] = par[node2] rank[node2] += rank[node1] return 1 res = n for node1, node2 in edges: res -= find(node1, node2) return res
The code about rank/size is misleading, - rank should be += 1 each time when rank are the same as it tracks the depth of the tree (init to 0s) - size on the other hand tracks total node connected (init to 1s)
Thanks for making this video. It seems we don't need to consider the rank of components in this problem: class Solution: def countComponents(self, n: int, edges: List[List[int]]) -> int:
root = list(range(n))
def find(i): while i != root[i]: root[i] = root[root[i]] i = root[i]
return i
def union(i, j): x = find(i) y = find(j)
if x == y: return 0 else: root[x] = y return 1
res = n for i, j in edges: res -= union(i, j)
return res This also works. Could anyone explain when rank would be necessary?
🚀 neetcode.io/ - I created a FREE site to make interview prep a lot easier, hope it helps! ❤
where do i find link to your Paytrion?
I learnt about UnionFind from this video. A word that means nothing to me 2 months ago! Thank you for the excellent explanation. You are much better at teaching than my college professors.
Union Find is a cool algorithm but the DFS is so much less code, so less chances of errors and typos. It applies to way more problems too. I was doing number of provinces and the code was way more complicated with union find
Exactly. DFS works just fine.
try 305
Yeah, but I failed an interview that wanted me to do union find and not DFS 😢
agreed
Union Find can be used for Online algorithms while DFS would have a much higher Time Complexity in an online algorithm
Just wanted to say thank you for these videos man. I've been putting off the leetcode grind for a couple of years now and your videos are the clearest on youtube by far and are making learning a breeze! :)
Definitely not the cleanest. He talks too much and struggles to explain advanced topics lucidly.
@@sonicjetson6253 Hey if you are so smart then go start a channel. Why are you even here watching this channel. LMAO.
Some Thoughts:
Use Union-Find when you have a large, sparse graph and need efficient connectivity checks and merges. It is especially suitable for dynamic graph problems where the structure of the graph changes frequently.
Use DFS when you need to traverse the graph, find paths, or need more information about the graph's structure. It is also a good choice for dense graphs and for problems where readability and simplicity are more important.
I find using DFS, VisitedSet and a Counter to be a much more natural way to solve this to me. This is how i approached it below. But it is good to know Union Find algorithm too in case it ever comes in handy.
# Create an adjacency list to represent the graph
graph = {i: [] for i in range(n)}
for edge in edges:
graph[edge[0]].append(edge[1])
graph[edge[1]].append(edge[0])
# A set to keep track of visited nodes
visited = set()
def dfs(node: int):
# Mark the node as visited
visited.add(node)
# Visit all the neighbors
for neighbor in graph[node]:
if neighbor not in visited:
dfs(neighbor)
# Counter for the number of connected components
components = 0
# Iterate through all nodes
for i in range(n):
if i not in visited:
# If a node hasn't been visited, it's a new component
dfs(i)
components += 1
return components
it's also more efficient. Only reason to use union find is if you're dynamically tracking the components in a graph where you are dynamically adding edges, like in the kruskal algorithm
This is exactly how I solved it as well
LC 684 - Redundant Connection is another Union Find Problem on Leetcode; pretty similar to this one.Thanks for clarifying this concept
Thanks for this comment, now I can kill two birds with one stone:D
This is super awesome! I have been using BFS or DFS traversal for pretty much all work-related graph problems so far. I have learned something new today!
Dude which domain do you work in? Just curious to know where you use graphs?
@@jacked-boy oh I'm working on projects that need to deal with 3D meshing, and other geometry related algos. But honestly this is (union find) such a beautiful algo for finding disjoint sets for my problems. Hope this helps!
@@dvsvkimo Got it, thanks for taking out time to respond, really appreciate it 😀. Yeah union find is quite helpful in those stuff!
My initial thought was to pass the parent array to a set and return the length of it. This way I thought we can get all the unique parents and it will match the number of connected components, But your trick to decrement the n on each union is very nice.
Same here, both would work actually as that's the same thought I had initially as well
@@zzzyyyxxx I tried to implement that but it actually doesn't work because some node might not be connected directly to the root parents.
This doesn't work unless you iterate over all "nodes" (1 through n) at the end and compress the paths to make sure every node is at depth 1 from the root in each component. What you can do however is iterate through the nodes and simply check if the parent of that node is equal to the root... this will tell you how many root nodes there are which equals the number of components.
Amazing video with detailed concept illustration!
I found a recursion could be used in find function.
def find(n1):
# find the parent of n1
if n1 == par[n1]:
return n1
# path compression
par[n1] = par[par[n1]]
return find(par[n1])
Just a little bit of change to your code can make it better actually. It can guarantee the height of the tree is no more than 2
def find(n1):
# find the parent of n1
if n1 == par[n1]:
return n1
# path compression
par[n1] = find[par[n1]]
return par[n1]
def find(node):
if par[node] != node:
par[node] = find(par[node]) # Path compression
return par[node]
NIce video - I think the key takeaway for the complexity analysis is that this algo would be running in O(E * V) Time (where V is no of node aka N) if implemented without that par[res] == par[par[res]] line. Since you added that optimization the traversal of the nodes is cut by half leading to a O(E * log(V)) Time which is why Union Find is more optimal. :)
Its O(E+V) in the first method
Rank + Path compression leads to an O(E+V) time complexity
Really?? Wow I wish this was explained in the video as well
Why we need the while loop in the find function? I thought we are always setting the ultimate parent in parent array. For example, in drawing explanation, after union 2, we set par[2] to 0, not 1
I think using recursion instead of the while loop makes it much easier to understand IMO.
Same question.
I was confused too, watching ruclips.net/video/0jNmHPfA_yE/видео.html helped
On union, the algorithm only updates the root node parent of the tree being merged into the larger tree. The while loop is necessary because of that. Imagine connected components 1-2 being merged into connected components 3-4-5. After the union, parent[1] will be 3 but parent[2] will still be 1. The union only updates the parent of the root merging into the larger tree. The path compression limits the depth of that while loop over time by effectively smashing the tree down, but that happens on the find, not on the union, so because of that, the while loop is necessary on the find.
I think we should only change the rank when they are equal and increment it by one, because all the other cases we can still make a tree of max rank by joining to the root of the biggest tree. but if the ranks are equal the biggest tree height is going to increase by one because we are adding an edge.
agree
How can you make everything daunting so easy? Hats off to you sir.
Thanks
Thank you so much Hamza!!!
You don't have to update the rank of p2 when (rank[p2] > rank[p1]) because we are only adding one node at a time, so the length of (root p2 node) is either gonna stay less than it's rank or become equal to the rank.
number of islands II is a wonderful example of why union find is more effective
Love all your videos. God knows when i too can think and write codes as fast
Thanks!
Thank you so much Vikas!!
DFS solution that remains consistent throughout all problems:
class Solution:
def countComponents(self, n: int, edges: List[List[int]]) -> int:
res=0
visit=set()
graph = {i:[] for i in range(n)}
for n1,n2 in edges:
graph[n1].append(n2)
graph[n2].append(n1)
def dfs(i):
if i in visit:
return False
if i in cycle:
return True
cycle.add(i)
T=1
for j in graph[i]:
T = T and dfs(j)
return T
for i in range(n):
cycle=set()
if dfs(i):
res+=1
visit.update(cycle)
return res
Thank you Neetcode!
I don't see why you are using the rank array because it doesn't really matter; once we see that their parents aren't the same we can connect them. I posted my code below as a reference.
class Solution(object):
def countComponents(self, n, edges):
par = [i for i in range(n)]
def find(n):
if n == par[n]:
return n
return find(par[n])
count = n
for n1, n2 in edges:
p1 = find(n1)
p2 = find(n2)
if p1 == p2:
continue
else:
par[p2] = p1
count -= 1
return count
it helps if u want to union the nodes to the bigger node to keep the tree more balanced. also incase a problem asks for size the rank will be helpfull
for 547 ->
class Solution:
def findCircleNum(self, isConnected: List[List[int]]) -> int:
n = len(isConnected)
parent = [i for i in range(n)]
rank = [1] * n
def find(i):
p = parent[i]
while p != parent[p]:
parent[p] = parent[parent[p]]
p = parent[p]
return p
def union(n1, n2):
p1, p2 = find(n1), find(n2)
# cycle found
if p1 == p2:
return 0
if rank[p1] >= rank[p2]:
parent[p2] = p1
rank[p2] += p1
else:
parent[p1] = p2
rank[p1] += p2
return 1
result = n
for i in range(n):
for j in range(i, n):
if isConnected[i][j] == 1:
result -= union(i, j)
return result
Practice for free on geeksforgeeks.org (Note: Problem is named differently).
* "Number of Provinces": practice.geeksforgeeks.org/problems/number-of-provinces/1/
* Theory: www.geeksforgeeks.org/connected-components-in-an-undirected-graph/
Thank you sir! You saved me.
Thanks for sharing this, but this isn't providing the same kind of input, right?
Aghhh, this was so elegant! Thank you!
A HUGE Thanks, it was a crystal clear explanation for me
whats the advantage of union find vs dfs?
Your videos are amazing, thank you for the clear and detailed explanations.
could you please share the spreadsheet link? You mentioned in the video that it would be in the description, but I couldn't find it though.
Sure, here it is: docs.google.com/spreadsheets/d/1A2PaQKcdwO_lwxz9bAnxXnIQayCouZP6d-ENrBz_NXc/edit#gid=0
Thank you very much for this video. I devoured the union find in one go. I usually tend to avoid this:(. Now going to Alien Dictionary. Topological sort 😁
line 10 to 11 in the code, should be replaced with this right? else there's no use of the while loop
temp = par[res]
par[res] = par[par[res]]
res = temp
It's a valid point. Without it, the real path compression won't happen!
Is this solution's complexity O(E * V) time and O(V) space?
Dope! When I first saw the video I was like "too easy, just dfs blah blah ...". Haven't seen anyone solving it using DSU. Thank you sir!
One question though, you don't need rank to solve this problem do you?
Thanks for explaining every super clearly. Make people easy to understand!
Hey bud, could you create a playlist with all 75 problems from that blind post? Thank you so much :)
Its created with 61 problems :)
Would a simple DFS with Visited (set) result in higher time complexity?
i would also like to know this....
DFS is O(E + V), Union-Find is O(E * ack(V)) ≈ O(E). ack() is the ackermann function
@@tsunghan_yu dfs is O(E + V)
@@sairajdas6692 You're right. Thanks
@@tsunghan_yu I think it's actually the "equivalent" of the inverse ackermann function aka alpha I think. Still reduces to O(E) in most cases
I completely got rid of the find function, and just used the parents array. The solution still works. What's going on here?
class Solution:
def countComponents(self, n: int, edges: List[List[int]]) -> int:
par = [i for i in range(n)]
rank = [1] * n
def union(a, b):
p1, p2 = par[a], par[b]
if p1 == p2:
return 0
if rank[p1] > rank[p2]:
par[p2] = p1
rank[p1] += rank[p2]
else:
par[p1] = p2
rank[p2] += rank[p1]
return 1
res = n
for n1, n2 in edges:
res -= union(n1, n2)
return res
@NeetCode At 14:00, why not do rank[p2] += 1?
We want to increase the rank, or size, of the connected component with parent p2 by the rank of the connected component of p1. The connected component of p1 might not always be rank of 1. It might have 2 or 3 or more nodes in the component.
@@JorgeWBush i use rank[p2] += 1 it works fine and got accepted on leetcode.
Dont skip this method , i used the dfs method in an online test but it was not accepted for many test cases because of the time limit exceeded case
you should've mentioned path compression as well as union by size and explain more about the new complexity
How does the path compression not jump to far into the future? `par[res] = par[par[res]]` seems like it should skip to far, but instead since the parent index is also it's value it stays at the same element.
For example, if par[res] = 5 and res = 5 then par[par[par[par[5]]]] == 5 because the 5 element is also 5 so it just keeps looping.
suppose you are looking at
par[5] = par[5] # now you are at the same level
par[5] = par[par5]] # now you are only going one level deep, which is something you are gonna do in the next jump anyway.
What's the time complexity for this?
what's the time and space complexity for this?
Hi , you mentioned the optimization 2 will get connected to 0 directly. but we did not update parent for 2 as 0. won't line 21 be par[p1] = find(p2) ?
Python hashmap only version:
class Solution:
def countComponents(self, n: int, edges: List[List[int]]) -> int:
parent_map = { i: i for i in range(n)}
subtree_size = { i: 1 for i in range(n)}
num_components = n
for starting_node, ending_node in edges:
if self.union(starting_node, ending_node, parent_map, subtree_size):
num_components -= 1
return num_components
def find_root_parent_node(self, current_node, parent_map):
parent_node = parent_map[current_node]
while current_node != parent_node:
grand_parent_node = parent_map[parent_node]
parent_map[current_node] = grand_parent_node
current_node = grand_parent_node
parent_node = parent_map[grand_parent_node]
return parent_node
def union(self, node_1, node_2, parent_map, subtree_size):
node_1_parent = self.find_root_parent_node(node_1, parent_map)
node_2_parent = self.find_root_parent_node(node_2, parent_map)
if node_1_parent == node_2_parent:
return False
smaller_parent = node_1_parent
larger_parent = node_2_parent
if subtree_size[node_1_parent] > subtree_size[node_2_parent]:
smaller_parent = node_2_parent
larger_parent = node_1_parent
parent_map[smaller_parent] = larger_parent
subtree_size[larger_parent] += subtree_size[smaller_parent]
return True
Pro Tip: When you reach a point in your narration where you catch yourself saying something to the effect of, "I didn't mention this before..." you should probably go back and mention it. Case in point: 9:10 12:20
is the time complexity O(e + log n)?
A small note about the rank array: this approach works for keeping track of which node's rank is bigger, but it becomes inaccurate because we need to reset rank of merged nodes to 1. Say, for example, you wanted to know about the sizes of the connected components, then this would become important.
Why not just update the parent once, at the end of the while loop? Since you’re doing this every time, the max rank should always be 2 anyway.
I think line 7 should be res = par[n1] (in the find function) ? thanks.
Dodged the union find in this video using DFS, coming back after few days for another problems that needed union find xDDD
So yeah, don't skip any new algorithm you saw!
the ranking part confused me of why it is being used. When performing the union, why does it matter if you set the parent of the of the smaller ranking to the bigger one? since the result is just the number of nodes minus unions you can just get rid of the entire ranking system correct? LC accepted it. Am i missing something?
Using your code and removing ranking it would look like this:
def countComponents(self, n, edges):
par = [i for i in range(n)]
def find(n1):
res = n1
while res != par[res]:
# path compression
par[res] = par[par[res]]
res = par[res]
return res
def union(n1, n2):
p1, p2 = find(n1), find(n2)
if p1 == p2:
return 0
else:
par[p1] = p2
return 1
res = n
for n1, n2 in edges:
res -= union(n1, n2)
return res
Minimize Tree Height: The primary purpose of using rank is to minimize the height of the trees that represent the sets. By always attaching the smaller tree to the root of the larger tree, the height of the tree increases only when two trees of the same height are united, and even then, the increase is by just one. This helps in keeping the trees as flat as possible.
Optimize Find Operation: The efficiency of the find operation depends on the height of the tree because the operation requires traversing from a node up to the root of the tree. By minimizing the height of the trees, the find operation becomes faster, as it needs to traverse fewer levels.
Please make a video on Accounts Merge Problem !
This solution isn't working as of October 7th 2024. Passed 18/19 tests, not sure why. Debugging the last test case is a pain.
Apparently, you use union by size not union by rank. Is it right? Using rank term confused me actually
Won't solving this using DFS or BFS be easier?
I totally think the same here. imo union find is more tricky to catch. for example :
from collections import defaultdict
class Solution:
def countComponents(self, n: int, edges: List[List[int]]) -> int:
count = 0
seen = [False] * n
graph = defaultdict(list)
for a,b in edges:
graph[a].append(b)
graph[b].append(a)
def dfs(i):
seen[i] = True
for node in graph[i]:
if not seen[node]:
dfs(node)
for i in range(n):
if not seen[i]:
count+=1
dfs(i)
return count
don't get me wrong, I love those videos. I learn so much for them, and I love the fact that it gives another opportunity to learn union find. But i was just meaning that it might be more trivial to understand the alternative imo. Thanks again for those amazing videos Mr Needcode.
@@doronbaruch665 union find is faster right?
as union find would be o(logn) compared to o(n) of dfs solution
@@doronbaruch665 hey thank you for the dfs code. May I know why you used a defaultdict instead of a ordinary list? I was initially coding like this, I couldn't figure out why it returns what it returns and why everyone else is using a defaultdict. Thank you in advance!
n = 5
edges = [[0,1],[1,2],[3,4]]
adjList = [[]] * n
for i in range(len(edges)):
adjList[edges[i][0]].append(edges[i][1])
adjList[edges[i][1]].append(edges[i][0])
print(adjList)
@@fa-rh1ux Union find is not faster, it's O(e+v) like neetcode said in the video. The dfs is also O(e+v) since you only visit each edge and node once.
Hey great solution, but still not the got the reason why is it necessary to the grandparent.?????
And also please dry run the code once you write it as it gets easier for us to understand.
if ur talking about the path compression, its to reduce the depth of the tree which reduces time complexity
I suspect that this code has a bug. If 1-9 connects and 3-4-5 connects. Then an edge 1-4 comes, then 3's parent will remain 3 but 3's parent should have been 1. Is that right?
For the purpose of just calculating connected components, it really doesnt matter but for problems where actual parents matter, Union find isnt the right approach.
rank[p2] += rank[p1] this will work but according to concept of union find it is wrong. p1 and p2 is like subtree and when we are combining these two tree, we need to make root and that will increase the rank by 1 not addition.
rank[p2]++;
When speeding up the find function with parent[p] = parent[parent[p]], why does this line do nothing if the grandparent doesn't exist?
I would expect it to return an error.
The grandparent will always exist, because even if it's not pointing to a different index (node), the grandparent of p will be the parent of p-as a sort of circular pointer. This is similar to how, in the beginning of the code, we set each parent to be itself, in the line `par = [i for i in range(n)]`.
What helped me understand this is to think of parents as pointers to nodes, instead of nodes themselves.
Watching/solving LeetCode 287 "Find the Duplicate Number" is helpful in understanding this, as it similarly treats the indexes of an array as though they were nodes in a graph.
## BFS and DFS solution comment either one to use
class Solution:
def countComponents(self, n: int, edges: List[List[int]]) -> int:
visited=set()
q=collections.deque()
count=0
hmap=collections.defaultdict(list)
for u,v in edges:
hmap[u].append(v)
hmap[v].append(u)
for node in range(n):
if node not in visited:
##BFS or DFS
count=count+1
self.bfs(visited,q,node,hmap)
self.dfs(visited,node,hmap)
return count
def bfs(self,visited,q,node,hmap):
q.append(node)
visited.add(node)
while(len(q)):
curr=q.popleft()
for child in hmap[curr]:
if child not in visited:
q.append(child)
visited.add(child)
def dfs(self,visited,node,hmap):
## Base
if node in visited:
return
visited.add(node)
for child in hmap[node]:
self.dfs(visited,child,hmap)
I just blew out today the Microsoft interview because of this damn problem. I did not see it before... :(
Did you let you use DFS or they said you had to use union find?
can you do a video on tarjan's algorithm in python?
I couldn't understand find. Without find it works fine
class Solution:
def countComponents(self, n: int, edges: List[List[int]]) -> int:
par = [i for i in range(n)]
rank = [1]*n
def find(node1, node2):
if par[node1] == par[node2]:
return 0
else:
if rank[node1] >= rank[node2]:
par[node2] = par[node1]
rank[node1] += rank[node2]
else:
par[node1] = par[node2]
rank[node2] += rank[node1]
return 1
res = n
for node1, node2 in edges:
res -= find(node1, node2)
return res
Why can't you run dfs on every node that has not been visited?
you definitely can!
this question does not open on leetcode now, it asks for premium subscription.
Or We can simply DFS,and everytime we DFS we increment a variable cnt,then output cnt-1.
The code about rank/size is misleading,
- rank should be += 1 each time when rank are the same as it tracks the depth of the tree (init to 0s)
- size on the other hand tracks total node connected (init to 1s)
great explanation!
Can anyone explain why he does rank[p2] += rank[p1] or vice versa? I intuitively did rank[p1]+=1 and same for p2, and it still works.
This question is similar to the question 547. Number of Provinces
Awesome video. Thank you
This question became paid on leetscode :P
@NeetCode there is a question similar to this, its leetcode 547 number of provinces
He said that in the video
would this be O(nlogn) time?
Why is number n - (num of unions) the answer?
Great video
How would this look like for Number of Provinces?
no changes, literally. Each province is nothing but a connected component :)
def find(x):
if par[x] != x:
return find(par[x])
else:
return x
Could someone share the dfs method?
its a premium question anyone know a similar question on some other platform?
video is pretty awesome
Actually Union Find is not an algorithm, many people get mistaken about it. It is a Data Structure.
It’s algorithm, Disjoint Set is data structure
What’s the complexity of this algorithm?
Can we do an union find on a connected graph?
Love this vid
Adjust our heads by 90 degrees??? It’s probably be easier to just provide the rendering to us older more vulnerable people
Thanks for making this video. It seems we don't need to consider the rank of components in this problem:
class Solution:
def countComponents(self, n: int, edges: List[List[int]]) -> int:
root = list(range(n))
def find(i):
while i != root[i]:
root[i] = root[root[i]]
i = root[i]
return i
def union(i, j):
x = find(i)
y = find(j)
if x == y:
return 0
else:
root[x] = y
return 1
res = n
for i, j in edges:
res -= union(i, j)
return res
This also works. Could anyone explain when rank would be necessary?
DFS approach is much simpler.
can someone tell me whats wrong in this code? #include
#include
#include
using namespace std;
void dfs(int currentNode, unordered_map< int, vector >& adjList, vector& visited){
visited[currentNode] = 1;
vector currentNodeNeighbors = adjList[currentNode];
for(int neighbor: currentNodeNeighbors){
if(!visited[neighbor]){
dfs(neighbor, adjList, visited);
}
}
}
int main(){
int n, e;
cin >> n >> e;
unordered_map< int, vector > adjList;
for(int i = 0; i < e; i++){
int n1, n2;
cin >> n1 >> n2;
adjList[n1].push_back(n2);
adjList[n2].push_back(n1);
}
vector visited(n + 1, 0);
int count = 0;
for(auto it = adjList.begin(); it != adjList.end(); it++){
int currentNode = it -> first;
if(!visited[currentNode]){
count++;
dfs(currentNode, adjList, visited);
}
}
cout
Hero!
this is hard... and I'd done this a year ago lol... why is it hard now all of a sudden when I want to switch my job lol
Increadible
can you please do, Add Two Numbers II | Leet code 445
Yes, pls, especially the one do not reverse the list
There is a prerequisite, please first do how to print any number then we ll learn addition. :)
why dun also upload your code? (cry)
thanks :)
No exemple? :(
C++ Solution:
class Solution {
public:
int uFind(int n, vector &parents) {
int curr = n;
while (curr != parents[curr]) {
parents[curr] = parents[parents[curr]];
curr = parents[curr];
}
return curr;
}
int uUnion(int n1, int n2, vector &parents, vector &rank) {
int parentN1 = uFind(n1, parents);
int parentN2 = uFind(n2, parents);
if (parentN1 == parentN2) {
return 0;
}
int newRoot = min(parentN1, parentN2);
int newChild = max(parentN1, parentN2);
parents[newChild] = newRoot;
rank[newRoot] += rank[newChild];
return 1;
}
int countComponents(int n, vector& edges) {
vector parents(n);
for (int i = 0; i < n; i++)
parents[i] = i;
vector rank(n, 1);
int result = n;
for (auto &edge : edges) {
result -= uUnion(edge[0], edge[1], parents, rank);
}
return result;
}
};
Union Find complexity is the inverse Ackermann function?
You should talk more clearly about complexity in all your videos.
For some reason I hate that union find works here.
where can i do this problem for free?
Reti Sociali