I just got this question during my interview with Uber. Unfortunately, I couldn't solve it and the interviewer told me in advance that I didn't pass, but encouraged me to try again in 6 months.
posting this just in case it helps anyone like me who struggled with the intuition of why we need post-order (instead of pre-order) traversal. so I kept thinking, "either way it guarantees that the current char will end up in front of all its children, so what the heck is the difference?". the magic moment was when I realized post-order is like doing everything from the _back_ of the line, so any of its parents we haven't processed yet are still free to end up in front of it. pre-order on the other hand doesn't leave any room for them, since it already starts from the front for some reason imagining people going to the back of an imaginary line helped
Done thanks 6:20 you only need to compare words to the word right after it and not to every other word in the list because the list itself is sorted lexicographically according to the language
For those looking for the solution that also works in Lintcode, when iterating over the characters to call DFS on each one, we can replace: `for char in adj:` with: `for char in reversed(sorted(adjacencyMap.keys())):` This is because our solution already works for all connected components, but when we have 2 connected components, we want to ensure that we sort the start of our connected components by Human dictionary order (because Lintcode requires it). The reason we're sorting in reverse order, ex: z before a, is that we're going to reverse the result later on. Was looking for a while on how to convert Neetcode's solution to work with Lintcode's requirements and couldn't find it anywhere so figured I'd share the solution once I got it working.
Found that the dfs part is very similar to the Course Schedule II solution. So modified it in the same way. The visited False and True is slightly confusing for me and difficult to remember. class Solution: def foreignDictionary(self, words: List[str]) -> str: adj = {c: set() for w in words for c in w} for i in range(len(words) - 1): w1, w2 = words[i], words[i + 1] minLen = min(len(w1), len(w2)) if len(w1) > len(w2) and w1[:minLen] == w2[:minLen]: return "" for j in range(minLen): if w1[j] != w2[j]: adj[w1[j]].add(w2[j]) break visited = set() path = set() res = [] def dfs(c): if c in visited: return True if c in path: return False path.add(c) for nei in adj[c]: if not dfs(nei): return False path.remove(c) visited.add(c) res.append(c) return True for c in adj: if not dfs(c): return "" return "".join(res[::-1])
Awesome stuff! Your videos are the reason I pass any interview at all. Just another optimization, instead of reversing you can also store the result in a deque and insert at the front. It saves one traversal.
In Python, inserting an element at the front of a standard list (which behaves like an array) has a time complexity of O(n), where n is the number of elements currently in the list. Here's why: Python lists are implemented using arrays. Inserting at the front requires shifting all existing elements one position to the right to make space for the new element. Shifting n elements takes about n operations
Thank you so much for such beautiful explanations....just wanted to point out that probably dfs function's last line should be return false coz it didn't find the cycle (though in python it won't matter as default will be None) but worth mentioning.Please correct me if I am wrong. Thanks...you are awesome!!!
I couldnt solve it on leetcode for its worst explanation, then i find out your video watch your explanation after that it dosent seems that much hard now. Thanks neetcode keep up the good work
Here is a working JS solution for the lintcode problem in JavaScript.. it wasn't easy! /** * @param words: a list of words * @return: a string which is correct order */ alienOrder(words) { const adj = {}; const visiting = {}; const result = []; for (const word of words) { for (const letter of word) adj[letter] = adj[letter] || new Set(); } for (let i = 1; i < words.length; i++) { const word = words[i-1]; const nextWord = words[i]; if (word.startsWith(nextWord)) return '';
for (let j = 0; j < nextWord.length; j++) { const first = word[j]; const last = nextWord[j]; if (first !== last) { if (first !== undefined) adj[first].add(last); break; } } } for (const char of Object.keys(adj).sort().reverse()) { if (dfs(char)) return '' } return result.reverse().join(''); function dfs(char) { // Return true if we have an invalid graph, cycle if (visiting[char]) return true; // Cycle detected if (char in visiting) return; // Already visited visiting[char] = true; for (const successor of adj[char]) { if (dfs(successor)) return true; } visiting[char] = false; result.push(char); } }
@@MichaelShingo class Solution: def alien_order(self, words: list[str]) -> str: adj = { c:set() for w in words for c in w } for i in range(len(words) - 1): w1, w2 = words[i], words[i + 1] minLen = min(len(w1), len(w2)) if len(w1) > len(w2) and w1[:minLen] == w2[:minLen]: return "" for j in range(minLen): if w1[j] != w2[j]: adj[w1[j]].add(w2[j]) break visit = {} # False = it is visited, True = it is in the current path res = [] def dfs(c): if c in visit: return visit[c]
visit[c] = True for nei in adj[c]: if dfs(nei): return True visit[c] = False res.append(c)
# for c in adj: # if dfs(c): # return ""
for c in sorted([c for c in adj.keys()], reverse=True): if dfs(c): return '' res.reverse() return "".join(res) if __name__ == "__main__": obj = Solution() words1 = ["wrt","wrf","er","ett","rftt"] print(obj.alien_order(words = words1)) words2 = ["z","x"] print(obj.alien_order(words = words2)) words3 = ["z","o"] print(obj.alien_order(words = words3)) words4 = ["hrn","hrf","er","enn","rfnn"] print(obj.alien_order(words = words4)) word5 = ["ab","adc"] print(obj.alien_order(words = word5))
Been following Neetcode blind 75 playlist and Solved all the questions today. This was my last question. Looking forward to solve neetcode 150. I am from tier 3 college and tier 3 branch of IT from India. Resources you provided give me hope that I could crack a job interview. Just wanted to write a quick thank you. And show you how your efforts are having an impact on students coming from a small village in India.
If your code fails to pass Leetcode testcases just add these two if statements before building adj dictionary: if not words: return "" if len(set(words)) == 1: return "".join(set(c for c in words[0])) This is needed due to test cases like ["z", "z"], or ["aba"]
Good solution. But I think it may fail in test cases like ["ab","abc"] in LintCode where we are expected to return abc as the solution, but as per this logic we are only concerned about ordering in w1 and w2 and hence this solution considers cba as the right answer. Need more clarity on this part.
actually no , there is no order could tell in this test case. Let say we have ["can","cancel"] in English, but instead of "canel" , We all know "aceln" is the correct order in English. But we wouldn't know it if we can't compare in Alien Language. So any order could be the answer. I hope this help your cunfuse.
The question on lint code is slightly different from the question on leetcode. Leetcode says return in any order if multiple solutions exist. Lint code asks to return in human lexicographical order if multiple solutions exist.
Quick question: why do you only have to compare every word to the word to its right? Wouldn't there be additional relationships between the first and the fifth word for example. How can you know for sure you aren't missing any relationships by just looking at every adjacent pair of words?
Since we know the words are already in sorted order. A simple example would be [a, b, c]. Here we know b comes after a (a -> b) and that c comes after b (b -> c). This tells us a -> b -> c. So it makes no sense to compare a and c.
I was also confused with the same doubt. Consider these test cases: a, bc, bd, ba a, bc, bd, bde, ed So, what I understood, if we try to create a test case to break the above assumption, either we end up creating a cycle or the order is preserved and maybe there are some extra edges added if we do for all the pairs.
LincCode further specifies when it's a tie: 4. There may be multiple valid order of letters, return the smallest in normal lexicographical order. 5. The letters in one string are of the same rank by default and are sorted in Human dictionary order.
I did the lintcode version. I started with the strategy used in in CourseSchedule II. I also used two sets putting all the letters in both (noPre, noPost). Then I removed elements from noPre if a letter had a preceding letter and removed from noPost if a letter had post letter. I combined the noPre and noPost to form a startOrEnd set. Think of it as ends of a rope\thread. I created a list from the startOrEnd set. Sorted the list in lexicographical order. Then did dfs for each letter in the startOrEnd list. I deleted an element from the adjacency list when all it's preceding elements had been processed successfully. If the adjacency list was not empty after processing everything in the startOrEnd list then the input had a loop and a blank string was returned. There was a loop check in the dfs as well.
If you're doing this on Lintcode, there's an extra requirement to return the soln in English lex order if there is more than one soln. to that end, just run the dfs on the keys in the adjacency in reverse... in a list (since we didn't use ordereddict) for c in sorted([c for c in adj.keys()], reverse=True): if dfs(c): return '' adj is my adjacency dict. then everything else can stay the same (including the reverse sort before return)
Instead of doing a reverse order, what if we change the direction of edges in our graph? Like similar to course schedule, adding the graph[w2[j]] = w1[j] kind of meaning like a dependency? Then we would be building it in correct order with post order DFS?
yes, I was thinking of the same. It makes more sense to say B->A where A is lexicographically smaller. This way, the post-order DFS seems very intuitive without the need for any reversing.
Also, your solution needs a small fix to work for all test cases in leetcode. The loop through in line 29, should be for all unique characters in the 'words' parameter and not all keys in the 'adj' dict. This modification handles scenarios like when input is ['z', 'z'].
Just got asked a very similar question in an interview, didn't remember the topo sort at all but somehow managed to do it using a dictionary with letter : unique_letters_after and then 1. Getting out a letter which doesn't appear in the dict values (ie. doesn't appear after any other letter) 2. Removing it from the dictionary 3. Repeat until dict is empty, and at this point you just have to append the last letter, which is a letter that doesn't appear in the dictionary keys
Great solution! I think the difficult part is to understand the problem and build up the adj list. After that the problem basically Course Schedule I or Course Schedule II (topological sort or graph coloring will works) :). Thanks for the explaination
Please update the c++ code for this problem on neetcode website. Outer for loop from inDegree initialisation is misplaced. Also neetcode compiler is giving, g++ internal compiler out of space error. Thank you. You are doing god's work. 🙏🏻
For those wondering how does it click to do post-order DFS in interviews, if you were to approach this with Kahn's BFS algorithm, you wouldn't have to worry about post-oder DFS.
Why didnt we check if the character is already in the result as a base case for dfs alongside the visited check? Normally we do topological sort like this.
Thanks for sharing this great approach. I have a question regarding this problem. If we pass in this input, ["zx","zy"], the expected output is "xyz". My question is, how are we able to determine that z comes after x and y? Thinking about an example alphabetically it can be either before or after which is why I think this input should ideally return invalid as shown below. Let me know your thoughts. Thanks again. ["ad", "af"] --> adf ["zd", "zf] --> dfz
You can't determine it. In this case, it's ambiguous where z should be placed and is a sort of "free" character. This is separate from a contradiction (cycle), so you can place z wherever as long as the relative ordering of all "non-free" characters is preserved in your final answer. In this example, there are multiple possible answers: "xyz", "xzy", and "zxy". Any of them work.
@NeetCode, can you check updated description for this problem, given the description isn't it too hard to be asked in an interview, but has been asked a lot of times
What about this test case ? {"bac", "bad", "baefgh"} graph will look like : c -> d -> e and the topo sort is : c, d, e But where is f, g, h in the order ???? HELP PLEASE !
I was confused because why didn't we use this method for the course schedule topological sort? we cleared the array once we are done with that. Can we use the same technique here?
Excellent video, thank you. One question: list like "abc", "ab" is not sorted lexicographically, right? The premise is "You are given a list of strings words from the alien language's dictionary, where the strings in words are sorted lexicographically by the rules of this new language." So, the input must be sorted lexicographically. List like "abc", "ab" should not appear in input. Why should we check "if len(w1) > len(w2) and w1[:minLen] == w2[:minLen]:" ? This should never happen, right?
For anyone else that comes across this: I believe the answer is because the absence of a character is considered lexicographically *lower* than the presence of any character. If `len(w1) > len(w2) and w1[:minLen] == w2[:minLen]` then the first word is larger than the second AND the second word is a substring of the first word, yielding an invalid input and no solution (see 4:10).
When we construct a DAG from the input, we may end up with several independent connected components, where each connected component is a single vertex, or a directed sequence of vertices. Within each connected component, for any directed edge, the source vertex is smaller than the sink vertex. However, between the vertices of two connected components, there's no lexicographic relationship, since there isn't an edge between any vertex in either of those two components. In that case, the ordering between vertices of those two components are the same as normal 'human' ordering. This means that when we do the topological sort on keys of the DAG (if you represent the adjacency list as a Map), we need to consider vertices in the 'human' ordering. This ensures that the final result is composed of vertices that, WITHIN their connected components, are ordered in the 'alien' way, but are ordered in the 'human' way ACROSS connected components.
The LeetCode problem states that you can return the ANY valid solution. So you don't need 'human' order across connected components. I see some people are talking about Lintcode having a different problem statement. If that's what you're referring to then you should say so.
How does this work for case ["a","ab"]? This creates and adj list of {a:{} , b:{}} I might be missing something but looping through each node in adj list without some sort of order could have the result return an answer of [a,b] or [b,a].
@@capooti Aren’t “cbda” and “abcd” both valid results for the input [“ab”, “adc”]. The only information [“ab”, “adc”] provides is ‘b’ is somewhere before ‘d’ in the alphabet. Therefore, any result that places ‘b’ somewhere before ‘d’ (and contains all characters found in the provided words) is correct.
since lintcode is adding another condition which is if there is multiple valid ordering of the letters then you should return the order with the smallest in lexographical order, i am not sure exactly how to implement it but i think if you were able to force the dfs to run on the letters backward it will work somehow
you could use these lines instead of the lines written at 29 and it will return the values in normal lexographical order as stated in lintcode, i tried it out for c in sorted(adj.keys(),reverse=True): if dfs(c): return ""
You said we are looping through all the characters because we are doing this in reverse order. I understand how we are doing that as explained in the video (using dfs with post order). The algo works, but I am not understanding how since we loop through all the characters how are we not appending repeated characters?
Just found about your channel, your work is awesome 🔥🔥🔥🔥. Can you make some new playlists on the basis of questions asked by each companies. Like all these were asked by Facebook or microsoft
Bro this was the answer I was keep looking for, I'm used to implement topological sort using post order BFS and there are bunch of soln out there which use some other technique like inorder to implement topological sort which I don't know. Thanks Man!
Why is there a need for visited to have a false or true value? Can't we just check if the key exists and return false? What's the difference between visited and in path?
the nodes in `visited` are nodes that have ever been touched/visited. The nodes in `path` are nodes that are currently being visited in the recursion stack path.
why would we ever get an input such as ["wrtkj","wrt"] when the input words is meant to be sorted lexicographically. hence I'm unsure why we need: if len(w1) > len(w2) and w1[:minLen] == w2[:minLen]: return "" can someone pls explain. thanks.
Watching the video, I had a question, I was wondering what if one of the strings is bigger and all the characters are the same except the last one of the bigger string, how will we compare them?
why do we only have to compare "adjacent words" when finding dependencies? why cant we make the same dependencies from 1st and 3rd node then 1st and 4th node and so on...
You can, but it's redundant. Consider the input ["g", "f", "c"] We know that there's an edge from 'g' -> 'f' and an edge 'f' -> 'c'. Due to their ordering in the input, we know that 'g' < 'c'. But if we add an edge 'g' -> 'c' as well, it'll make no difference since we'll still have to visit 'f' before we visit 'c'. Hence, only comparing adjacent words is more efficient and prevents the runtime from going to quadratic in the number of words as opposed to linear.
anyone have the complexity of this ? I think it's O(N*M) + O(K+E) with: N : number of words M : max length of words K : number of node E : number of edge
Great video but I prefer using Khan's algorithm for this. Its easier to understand and its pretty similar to the solutions you show in course scheduler I & II. Cool that this way to solve it exists though.
for those who are finding it difficult due to use of visited dictionary: def alienOrder(self, words: List[str]) -> str: graph = {chr: set() for word in words for chr in word} for i in range(len(words) - 1): w1 , w2 = words[i], words[i+1] minLen = min(len(w1), len(w2)) if len(w1) > len(w2) and w1[:minLen] == w2[:minLen]: return "" for j in range(minLen): if w1[j] != w2[j]: graph[w1[j]].add(w2[j]) break print(graph) output = [] stack = [] def dfs(vertex): if vertex in stack: return False if vertex in output: return True stack.append(vertex) for neighbor in graph[vertex]: if not dfs(neighbor): return False stack.pop() output.append(vertex) return True for chr in graph: if not dfs(chr): return "" return "".join(output[::-1]) it uses stack if it has been popped from stack at the end when all neighbors are visited then it wont exist in that anymore so we cant detect cycle ..i find this much easy...hats off to this guy with the implementation of adjlist and understanding this problem and then thinking this could be solved by topological sorting method with a check of detecting cycle...this is insane bro....
So that when we call dfs on its neighbors we can detect if a cycle is found by visiting this node again. Once we have checked all the neighbors we know we havent found a cycle so we can set it back to false to remove it from the path of nodes we are checking
neetcode I love your videos and solutions but for this one i gotta say i think the Leetcode Premium solution using Kahn's Algorithm is far more intuitive. It just makes more sense for me
@NeetCode, hello. Thanks for the sharing. I got one more question. The problem said "If there are multiple solutions, return any of them", what about return only the one as the earth english order like abcdefg...? what should we need to do?
U an Alien God. Setting the visit[c] = True and then to visit[c] = False, seems like a backtracking situation. But it isn't. Since our visit[c] was not set to False initially.
the explanation is wrong, for example for the test case ["wrt","wrf"] the expected output is "rtfw" This means that only the first character and the length matter and the other chars do not, in the explanation he traverses the string like the other characters matter. I think this video is deprecated.
In the test case you provide, the only character of difference is the last one. So given the ordering, you can use any solution that has "t" before "f". For example, "rwtf" and "tfrw" will also pass that test case. It is a problem that may have multiple solutions like so.
@@avenged7ex With this implementation, for test case ["a", "ab"] the result will return "ba" after it is reversed. Ran it in leetcode as a test case and "ba" is still accepted as an answer which is strange as I would think it would need to be "ab"
Looked at the LC discussion and realized I've misunderstood the description. For those reading this and wondering the same question, the description said the words are sorted lexicographically, not the individual letters. In this example ["abs", "add"], the word "add" does not imply the ordering a < d.
@@Allan-ti5uu Brilliant! Thanks, what a twisted question. Now coming to it, do Facebook engineers expect someone to solve it if they've never done this before in their lives XD
for the prefix check, is leetcode the same as lintcode where: The dictionary is invalid, if string a is prefix of string b and b is appear before a. and can string a and b appear anywhere in words i.e. [a, ........, b] so a pairwise check for len(w1) > len(w2) and w1[:minLen] == w2[:minLen] is insufficient as you need to check every word with every other?
You can fix this by making following modification: - In line 29, loop through all unique characters found in the 'words' list all_chars = set() for word in words: for char in word: all_chars.add(char) for char in all_chars: if dfs(char): return ""
bfs is actually more intuitive and cleaner: ``` class Solution: def alienOrder(self, words: List[str]) -> str: adj_list = collections.defaultdict(set) in_degree = Counter({c:0 for word in words for c in word}) for i in range(len(words)-1): word = words[i] next_word = words[i+1] for char1, char2 in zip(word, next_word): if char1 != char2: if char2 not in adj_list[char1]: adj_list[char1].add(char2) in_degree[char2] += 1 break else: if len(next_word) < len(word): return "" queue = collections.deque([c for c in in_degree if in_degree[c] == 0]) output = [] while queue: char = queue.popleft() output.append(char) for d in adj_list[char]: in_degree[d] -= 1 if in_degree[d] == 0: queue.append(d) if len(output) < len(in_degree): return "" return "".join(output) ```
🚀 neetcode.io/ - I created a FREE site to make interview prep a lot easier, hope it helps! ❤
wish the python solutions you just wrote would be on your site as well (:
Definitely it is helping me, Thanks a lot man!
😊
I can't wait when someday my client will request to implement this algorithm so they can sell the product to Alien.
Man if I get this question during an interview, ima just start looking for another interview.
for real. first of all, this level of difficulty is just unnecessary
Unfortunately it's very common in interviews especially at Airbnb
This is a catch, if you manage to solve this in interview, you will be put almost ahead of many other candidates. If you can't its just a bad day :)
I just got this question during my interview with Uber. Unfortunately, I couldn't solve it and the interviewer told me in advance that I didn't pass, but encouraged me to try again in 6 months.
You could also build the dependency in reverse order so you don't have do the reverse at the end. E.g., rather than a->c, do c->a
Yeah, absolutely
and we are getting O(n) time complexity for the reverse function, so we can optimize it by building adjacency in the opposite order
or we can use deque and instead of res.append(c) we do res.appendleft(c)
The postorder explanation was really mind blowing. I thought you would do it the traditional way by keeping incoming edge counts.
Solved this on my own. I don't know if I should celebrate such a small achievement or not but hey, I am improving. Thanks NC, following your 150 sheet
Hi, congrats. Did you finish the NC150 before this? Do you rcm any books?
one of the best part of your video is reading question. It is much clear after your explanation!
posting this just in case it helps anyone like me who struggled with the intuition of why we need post-order (instead of pre-order) traversal. so I kept thinking, "either way it guarantees that the current char will end up in front of all its children, so what the heck is the difference?". the magic moment was when I realized post-order is like doing everything from the _back_ of the line, so any of its parents we haven't processed yet are still free to end up in front of it. pre-order on the other hand doesn't leave any room for them, since it already starts from the front
for some reason imagining people going to the back of an imaginary line helped
Done thanks
6:20 you only need to compare words to the word right after it and not to every other word in the list because the list itself is sorted lexicographically according to the language
For those looking for the solution that also works in Lintcode, when iterating over the characters to call DFS on each one, we can replace:
`for char in adj:`
with:
`for char in reversed(sorted(adjacencyMap.keys())):`
This is because our solution already works for all connected components, but when we have 2 connected components, we want to ensure that we sort the start of our connected components by Human dictionary order (because Lintcode requires it). The reason we're sorting in reverse order, ex: z before a, is that we're going to reverse the result later on. Was looking for a while on how to convert Neetcode's solution to work with Lintcode's requirements and couldn't find it anywhere so figured I'd share the solution once I got it working.
Amazing, this works thank you
thanks
Mine still doesnt Work. I dont know why!
Got asked this in a Facebook interview. Needless to say I failed
Right on, traveler.
Damn. You were passing interview to Senior SDE position, I guess?
@@StanisLoveSidno lol, this was for an intern spot
@@rodgerdodger17 bloody hell
@@StanisLoveSid It works the other way around. Senior positions need strong System Design and Behavioral.
Found that the dfs part is very similar to the Course Schedule II solution. So modified it in the same way. The visited False and True is slightly confusing for me and difficult to remember.
class Solution:
def foreignDictionary(self, words: List[str]) -> str:
adj = {c: set() for w in words for c in w}
for i in range(len(words) - 1):
w1, w2 = words[i], words[i + 1]
minLen = min(len(w1), len(w2))
if len(w1) > len(w2) and w1[:minLen] == w2[:minLen]:
return ""
for j in range(minLen):
if w1[j] != w2[j]:
adj[w1[j]].add(w2[j])
break
visited = set()
path = set()
res = []
def dfs(c):
if c in visited:
return True
if c in path:
return False
path.add(c)
for nei in adj[c]:
if not dfs(nei):
return False
path.remove(c)
visited.add(c)
res.append(c)
return True
for c in adj:
if not dfs(c):
return ""
return "".join(res[::-1])
Awesome stuff! Your videos are the reason I pass any interview at all.
Just another optimization, instead of reversing you can also store the result in a deque and insert at the front. It saves one traversal.
In Python, inserting an element at the front of a standard list (which behaves like an array) has a time complexity of O(n), where n is the number of elements currently in the list.
Here's why:
Python lists are implemented using arrays.
Inserting at the front requires shifting all existing elements one position to the right to make space for the new element.
Shifting n elements takes about n operations
that's why a deque was suggested: inserting is O(1) at front
Kahn's algorithm can be implemented to find Topological order in a DAG.
yes, kahn;s method is simple to understand
any examples sir, please.
any examples sir, please.
any examples sir, please.
any example sir, please.
Thank you so much for such beautiful explanations....just wanted to point out that probably dfs function's last line should be return false coz it didn't find the cycle (though in python it won't matter as default will be None) but worth mentioning.Please correct me if I am wrong.
Thanks...you are awesome!!!
U are right.
I couldnt solve it on leetcode for its worst explanation, then i find out your video watch your explanation after that it dosent seems that much hard now. Thanks neetcode keep up the good work
I think the lintcode version of this question is different and it's failing tests. Eg:
Input
["ab","adc"]
Output
"cbda"
Expected
"abcd"
Here is a working JS solution for the lintcode problem in JavaScript.. it wasn't easy!
/**
* @param words: a list of words
* @return: a string which is correct order
*/
alienOrder(words) {
const adj = {};
const visiting = {};
const result = [];
for (const word of words) {
for (const letter of word) adj[letter] = adj[letter] || new Set();
}
for (let i = 1; i < words.length; i++) {
const word = words[i-1];
const nextWord = words[i];
if (word.startsWith(nextWord)) return '';
for (let j = 0; j < nextWord.length; j++) {
const first = word[j];
const last = nextWord[j];
if (first !== last) {
if (first !== undefined) adj[first].add(last);
break;
}
}
}
for (const char of Object.keys(adj).sort().reverse()) {
if (dfs(char)) return ''
}
return result.reverse().join('');
function dfs(char) { // Return true if we have an invalid graph, cycle
if (visiting[char]) return true; // Cycle detected
if (char in visiting) return; // Already visited
visiting[char] = true;
for (const successor of adj[char]) {
if (dfs(successor)) return true;
}
visiting[char] = false;
result.push(char);
}
}
yes, how do you find the right one when there's multiple possibilities?
@@MichaelShingo
class Solution:
def alien_order(self, words: list[str]) -> str:
adj = { c:set() for w in words for c in w }
for i in range(len(words) - 1):
w1, w2 = words[i], words[i + 1]
minLen = min(len(w1), len(w2))
if len(w1) > len(w2) and w1[:minLen] == w2[:minLen]:
return ""
for j in range(minLen):
if w1[j] != w2[j]:
adj[w1[j]].add(w2[j])
break
visit = {} # False = it is visited, True = it is in the current path
res = []
def dfs(c):
if c in visit:
return visit[c]
visit[c] = True
for nei in adj[c]:
if dfs(nei):
return True
visit[c] = False
res.append(c)
# for c in adj:
# if dfs(c):
# return ""
for c in sorted([c for c in adj.keys()], reverse=True):
if dfs(c): return ''
res.reverse()
return "".join(res)
if __name__ == "__main__":
obj = Solution()
words1 = ["wrt","wrf","er","ett","rftt"]
print(obj.alien_order(words = words1))
words2 = ["z","x"]
print(obj.alien_order(words = words2))
words3 = ["z","o"]
print(obj.alien_order(words = words3))
words4 = ["hrn","hrf","er","enn","rfnn"]
print(obj.alien_order(words = words4))
word5 = ["ab","adc"]
print(obj.alien_order(words = word5))
Been following Neetcode blind 75 playlist and Solved all the questions today. This was my last question. Looking forward to solve neetcode 150.
I am from tier 3 college and tier 3 branch of IT from India. Resources you provided give me hope that I could crack a job interview.
Just wanted to write a quick thank you. And show you how your efforts are having an impact on students coming from a small village in India.
If your code fails to pass Leetcode testcases just add these two if statements before building adj dictionary:
if not words: return ""
if len(set(words)) == 1:
return "".join(set(c for c in words[0]))
This is needed due to test cases like ["z", "z"], or ["aba"]
Cases will still fail for inputs like "ab","abc"
Good solution. But I think it may fail in test cases like ["ab","abc"] in LintCode where we are expected to return abc as the solution, but as per this logic we are only concerned about ordering in w1 and w2 and hence this solution considers cba as the right answer. Need more clarity on this part.
actually no , there is no order could tell in this test case. Let say we have ["can","cancel"] in English, but instead of "canel" , We all know "aceln" is the correct order in English. But we wouldn't know it if we can't compare in Alien Language. So any order could be the answer. I hope this help your cunfuse.
The question on lint code is slightly different from the question on leetcode. Leetcode says return in any order if multiple solutions exist. Lint code asks to return in human lexicographical order if multiple solutions exist.
@@AhmedSaeed-pm2eythank you, I did not notice the hıman order part there
Quick question: why do you only have to compare every word to the word to its right? Wouldn't there be additional relationships between the first and the fifth word for example. How can you know for sure you aren't missing any relationships by just looking at every adjacent pair of words?
Since we know the words are already in sorted order.
A simple example would be [a, b, c]. Here we know b comes after a (a -> b) and that c comes after b (b -> c).
This tells us a -> b -> c. So it makes no sense to compare a and c.
@@NeetCode Perfect makes sense. Thank you!
I was also confused with the same doubt.
Consider these test cases:
a, bc, bd, ba
a, bc, bd, bde, ed
So, what I understood, if we try to create a test case to break the above assumption, either we end up creating a cycle or the order is preserved and maybe there are some extra edges added if we do for all the pairs.
LincCode further specifies when it's a tie:
4. There may be multiple valid order of letters, return the smallest in normal lexicographical order.
5. The letters in one string are of the same rank by default and are sorted in Human dictionary order.
I did the lintcode version. I started with the strategy used in in CourseSchedule II. I also used two sets putting all the letters in both (noPre, noPost). Then I removed elements from noPre if a letter had a preceding letter and removed from noPost if a letter had post letter.
I combined the noPre and noPost to form a startOrEnd set. Think of it as ends of a rope\thread. I created a list from the startOrEnd set. Sorted the list in lexicographical order. Then did dfs for each letter in the startOrEnd list. I deleted an element from the adjacency list when all it's preceding elements had been processed successfully. If the adjacency list was not empty after processing everything in the startOrEnd list then the input had a loop and a blank string was returned. There was a loop check in the dfs as well.
If you're doing this on Lintcode, there's an extra requirement to return the soln in English lex order if there is more than one soln.
to that end, just run the dfs on the keys in the adjacency in reverse... in a list (since we didn't use ordereddict)
for c in sorted([c for c in adj.keys()], reverse=True):
if dfs(c): return ''
adj is my adjacency dict.
then everything else can stay the same (including the reverse sort before return)
Wow, thank you so much! I was really struggling with the issue, and your solution to it is beautiful
Do we always go with post-order traverse for DAG and reverse the result? Not sure if my professor ever mentioned this. lol
Instead of doing a reverse order, what if we change the direction of edges in our graph? Like similar to course schedule, adding the graph[w2[j]] = w1[j] kind of meaning like a dependency? Then we would be building it in correct order with post order DFS?
yes, I was thinking of the same. It makes more sense to say B->A where A is lexicographically smaller. This way, the post-order DFS seems very intuitive without the need for any reversing.
Thanks for the explanation. You can make 'res' a deque() type and use appendleft(), thereby avoiding the need to reverse the result towards the end.
Also, your solution needs a small fix to work for all test cases in leetcode.
The loop through in line 29, should be for all unique characters in the 'words' parameter and not all keys in the 'adj' dict. This modification handles scenarios like when input is ['z', 'z'].
Just got asked a very similar question in an interview, didn't remember the topo sort at all but somehow managed to do it using a dictionary with letter : unique_letters_after and then
1. Getting out a letter which doesn't appear in the dict values (ie. doesn't appear after any other letter)
2. Removing it from the dictionary
3. Repeat until dict is empty, and at this point you just have to append the last letter, which is a letter that doesn't appear in the dictionary keys
Great solution! I think the difficult part is to understand the problem and build up the adj list. After that the problem basically Course Schedule I or Course Schedule II (topological sort or graph coloring will works) :). Thanks for the explaination
Please update the c++ code for this problem on neetcode website.
Outer for loop from inDegree initialisation is misplaced.
Also neetcode compiler is giving, g++ internal compiler out of space error.
Thank you. You are doing god's work. 🙏🏻
why need to try dfs for every sigle node by using for loop?
damn the code golfing technique with the `visit` dict is lowkey so smart
For those wondering how does it click to do post-order DFS in interviews, if you were to approach this with Kahn's BFS algorithm, you wouldn't have to worry about post-oder DFS.
Why didnt we check if the character is already in the result as a base case for dfs alongside the visited check? Normally we do topological sort like this.
Requesting dungean game and cherry pickup as well! Thanks for the awesome explanation as always.
Would the code pass the test case like ["x", "x"] and the case ["zy","zx"]?
Thanks for sharing this great approach. I have a question regarding this problem.
If we pass in this input, ["zx","zy"], the expected output is "xyz". My question is, how are we able to determine that z comes after x and y?
Thinking about an example alphabetically it can be either before or after which is why I think this input should ideally return invalid as shown below. Let me know your thoughts. Thanks again.
["ad", "af"] --> adf
["zd", "zf] --> dfz
I am facing the same issue. Were you able to figure this out?
You can't determine it. In this case, it's ambiguous where z should be placed and is a sort of "free" character. This is separate from a contradiction (cycle), so you can place z wherever as long as the relative ordering of all "non-free" characters is preserved in your final answer.
In this example, there are multiple possible answers: "xyz", "xzy", and "zxy". Any of them work.
@NeetCode, can you check updated description for this problem, given the description isn't it too hard to be asked in an interview, but has been asked a lot of times
What about this test case ?
{"bac", "bad", "baefgh"}
graph will look like : c -> d -> e
and the topo sort is : c, d, e
But where is f, g, h in the order ????
HELP PLEASE !
Adding all the unique characters in the words to the adjacency list would help in resolving all the edge cases.
I was confused because why didn't we use this method for the course schedule topological sort? we cleared the array once we are done with that. Can we use the same technique here?
i think you should be able to apply this to the course schedule.
Excellent video, thank you. One question: list like "abc", "ab" is not sorted lexicographically, right? The premise is "You are given a list of strings words from the alien language's dictionary, where the strings in words are sorted lexicographically by the rules of this new language." So, the input must be sorted lexicographically. List like "abc", "ab" should not appear in input. Why should we check "if len(w1) > len(w2) and w1[:minLen] == w2[:minLen]:" ? This should never happen, right?
At 17:08, why does line 8 need `len(w1) > len(w2)`?
For anyone else that comes across this:
I believe the answer is because the absence of a character is considered lexicographically *lower* than the presence of any character.
If `len(w1) > len(w2) and w1[:minLen] == w2[:minLen]` then the first word is larger than the second AND the second word is a substring of the first word, yielding an invalid input and no solution (see 4:10).
When we construct a DAG from the input, we may end up with several independent connected components, where each connected component is a single vertex, or a directed sequence of vertices. Within each connected component, for any directed edge, the source vertex is smaller than the sink vertex. However, between the vertices of two connected components, there's no lexicographic relationship, since there isn't an edge between any vertex in either of those two components. In that case, the ordering between vertices of those two components are the same as normal 'human' ordering. This means that when we do the topological sort on keys of the DAG (if you represent the adjacency list as a Map), we need to consider vertices in the 'human' ordering. This ensures that the final result is composed of vertices that, WITHIN their connected components, are ordered in the 'alien' way, but are ordered in the 'human' way ACROSS connected components.
The LeetCode problem states that you can return the ANY valid solution. So you don't need 'human' order across connected components.
I see some people are talking about Lintcode having a different problem statement. If that's what you're referring to then you should say so.
Brilliant solution, Enjoyed the explanation!
How does this work for case ["a","ab"]? This creates and adj list of {a:{} , b:{}}
I might be missing something but looping through each node in adj list without some sort of order could have the result return an answer of [a,b] or [b,a].
Yes, it doesn't pass that test for me too.
Another failing test: ["ab","adc"] returns "cbda", while it should be "abcd"
@@capooti Aren’t “cbda” and “abcd” both valid results for the input [“ab”, “adc”]. The only information [“ab”, “adc”] provides is ‘b’ is somewhere before ‘d’ in the alphabet. Therefore, any result that places ‘b’ somewhere before ‘d’ (and contains all characters found in the provided words) is correct.
@@capooti Should be "abdc" you mean
AB
ADC
A -> B -> D -> C
since lintcode is adding another condition which is if there is multiple valid ordering of the letters then you should return the order with the smallest in lexographical order, i am not sure exactly how to implement it but i think if you were able to force the dfs to run on the letters backward it will work somehow
you could use these lines instead of the lines written at 29 and it will return the values in normal lexographical order as stated in lintcode, i tried it out
for c in sorted(adj.keys(),reverse=True):
if dfs(c):
return ""
In Lintcode, the answer to "zy" and "zx" is "yxz". How do we put the z to the end?
You said we are looping through all the characters because we are doing this in reverse order. I understand how we are doing that as explained in the video (using dfs with post order). The algo works, but I am not understanding how since we loop through all the characters how are we not appending repeated characters?
Just found about your channel, your work is awesome 🔥🔥🔥🔥. Can you make some new playlists on the basis of questions asked by each companies.
Like all these were asked by Facebook or microsoft
How did I not find your channel before, awesome explanation!
post order part is similar to reconstruct itinerary, like finding euler path (hierholzer's algoritm)
At 12:40 we could have gone to node B first also. Then it would have been BCA which, if u reverse it, is ACB which is incorrect. Right?
We cannot process B yet as it has a child C, which is unvisited. So no matter if we go to B or C from A, we still get CBA and reversing will give ABC
Bro this was the answer I was keep looking for, I'm used to implement topological sort using post order BFS and there are bunch of soln out there which use some other technique like inorder to implement topological sort which I don't know. Thanks Man!
Watched this video 10 days ago. And try to redo it now and already forgot some of the details.
Why is there a need for visited to have a false or true value? Can't we just check if the key exists and return false? What's the difference between visited and in path?
the nodes in `visited` are nodes that have ever been touched/visited. The nodes in `path` are nodes that are currently being visited in the recursion stack path.
Could you give some testcases where null string is returned? I cannot seem to wrap my head around the false cases.
I have an interview on Monday and this is their #1 tagged on LC. imma just pray they don't ask me this
Even if I got the topological sort and DFS idea, there's no way I would have figured out why we need to do a postorder DFS. DANG!
Very well explained solution of a hard problem. The explanation of contradiction solution and DFS explanation is very good :)
why would we ever get an input such as ["wrtkj","wrt"] when the input words is meant to be sorted lexicographically. hence I'm unsure why we need:
if len(w1) > len(w2) and w1[:minLen] == w2[:minLen]:
return ""
can someone pls explain. thanks.
Watching the video, I had a question, I was wondering what if one of the strings is bigger and all the characters are the same except the last one of the bigger string, how will we compare them?
neetcode has implemented top sort in course schedule 2, that template is easier to implement. Try that out..
Very nice explantion. Thank you!
Great explanation my friend! I appreciate it!
Very well explained! Thanks a ton for these videos. Appreciate it.
why do we only have to compare "adjacent words" when finding dependencies?
why cant we make the same dependencies from 1st and 3rd node
then 1st and 4th node
and so on...
You can, but it's redundant. Consider the input ["g", "f", "c"]
We know that there's an edge from 'g' -> 'f' and an edge 'f' -> 'c'. Due to their ordering in the input, we know that 'g' < 'c'. But if we add an edge 'g' -> 'c' as well, it'll make no difference since we'll still have to visit 'f' before we visit 'c'. Hence, only comparing adjacent words is more efficient and prevents the runtime from going to quadratic in the number of words as opposed to linear.
Excellent explanation. Awesome channel . Thanks !!!!
anyone have the complexity of this ?
I think it's O(N*M) + O(K+E)
with:
N : number of words
M : max length of words
K : number of node
E : number of edge
good explanation!
Nice solution, topological sort using DFS is slightly not clear to me. Do you have a video specifically explaining too sort using DFS?
Great video but I prefer using Khan's algorithm for this. Its easier to understand and its pretty similar to the solutions you show in course scheduler I & II. Cool that this way to solve it exists though.
why don't you return anything at the end of the dfs method?
for those who are finding it difficult due to use of visited dictionary:
def alienOrder(self, words: List[str]) -> str:
graph = {chr: set() for word in words for chr in word}
for i in range(len(words) - 1):
w1 , w2 = words[i], words[i+1]
minLen = min(len(w1), len(w2))
if len(w1) > len(w2) and w1[:minLen] == w2[:minLen]:
return ""
for j in range(minLen):
if w1[j] != w2[j]:
graph[w1[j]].add(w2[j])
break
print(graph)
output = []
stack = []
def dfs(vertex):
if vertex in stack:
return False
if vertex in output:
return True
stack.append(vertex)
for neighbor in graph[vertex]:
if not dfs(neighbor):
return False
stack.pop()
output.append(vertex)
return True
for chr in graph:
if not dfs(chr):
return ""
return "".join(output[::-1])
it uses stack if it has been popped from stack at the end when all neighbors are visited then it wont exist in that anymore so we cant detect cycle ..i find this much easy...hats off to this guy with the implementation of adjlist and understanding this problem and then thinking this could be solved by topological sorting method with a check of detecting cycle...this is insane bro....
3:10
i couldn't understand the question lol thanks for another great explanation!
Why just the next word in graph construction and not all succeeding words?
Line 6. w2 should be from i to len(words) -1, right?
Nice Explanation, but i have one question in DFS method, for visit array why we are making True before False.
So that when we call dfs on its neighbors we can detect if a cycle is found by visiting this node again. Once we have checked all the neighbors we know we havent found a cycle so we can set it back to false to remove it from the path of nodes we are checking
neetcode I love your videos and solutions but for this one i gotta say i think the Leetcode Premium solution using Kahn's Algorithm is far more intuitive. It just makes more sense for me
@NeetCode, hello. Thanks for the sharing. I got one more question. The problem said "If there are multiple solutions, return any of them", what about return only the one as the earth english order like abcdefg...? what should we need to do?
you are wayy too awesome!!
Thanks for post order dfs, even though Bfs is easier in this question.
Anyone know what time/space complexity this is mind sharing? I'm confident my idea isn't correct. Thanks!
15:31 he said it's the number of characters in the input for time and i think it's the same for space too :))
U an Alien God. Setting the visit[c] = True and then to visit[c] = False, seems like a backtracking situation. But it isn't. Since our visit[c] was not set to False initially.
the explanation is wrong, for example for the test case
["wrt","wrf"]
the expected output is
"rtfw"
This means that only the first character and the length matter and the other chars do not, in the explanation he traverses the string like the other characters matter. I think this video is deprecated.
In the test case you provide, the only character of difference is the last one. So given the ordering, you can use any solution that has "t" before "f". For example, "rwtf" and "tfrw" will also pass that test case. It is a problem that may have multiple solutions like so.
BFS seemed more intuitive here for me
jfc. my interview is in a month and questions like this destroy my confidence 😤
Hi, for input [a,ab], ur code will give out "ba" but the answer is "ab", could explain a bit plz?
Finish watching the video. The string is reversed before returning
@@avenged7ex With this implementation, for test case ["a", "ab"] the result will return "ba" after it is reversed. Ran it in leetcode as a test case and "ba" is still accepted as an answer which is strange as I would think it would need to be "ab"
Looked at the LC discussion and realized I've misunderstood the description. For those reading this and wondering the same question, the description said the words are sorted lexicographically, not the individual letters. In this example ["abs", "add"], the word "add" does not imply the ordering a < d.
@@Allan-ti5uu Brilliant! Thanks, what a twisted question. Now coming to it, do Facebook engineers expect someone to solve it if they've never done this before in their lives XD
for the prefix check, is leetcode the same as lintcode where: The dictionary is invalid, if string a is prefix of string b and b is appear before a.
and can string a and b appear anywhere in words i.e. [a, ........, b] so a pairwise check for len(w1) > len(w2) and w1[:minLen] == w2[:minLen] is insufficient as you need to check every word with every other?
Thank you for this.
Hard became Easy for me after watching this
when he said this was intuitive, he was lying.
Great explanation!!
which school is it thought?
your childhood was awesome
After watching your explanation, I feel like a big dumb dumb for not thinking to model this as a graph
Is there any way to create a Lintcode account without Wechat? Can someone with a wechat account help me activate it
can you please explain stone game showing min max algo?
This is too hard for me now😶🌫
Fails for [“z”, “z”]
would changing the if check in line 8 to >= solve this ? I would say this is an invalid test case or a cycle probably.
You can fix this by making following modification:
- In line 29, loop through all unique characters found in the 'words' list
all_chars = set()
for word in words:
for char in word:
all_chars.add(char)
for char in all_chars:
if dfs(char):
return ""
bfs is actually more intuitive and cleaner:
```
class Solution:
def alienOrder(self, words: List[str]) -> str:
adj_list = collections.defaultdict(set)
in_degree = Counter({c:0 for word in words for c in word})
for i in range(len(words)-1):
word = words[i]
next_word = words[i+1]
for char1, char2 in zip(word, next_word):
if char1 != char2:
if char2 not in adj_list[char1]:
adj_list[char1].add(char2)
in_degree[char2] += 1
break
else:
if len(next_word) < len(word):
return ""
queue = collections.deque([c for c in in_degree if in_degree[c] == 0])
output = []
while queue:
char = queue.popleft()
output.append(char)
for d in adj_list[char]:
in_degree[d] -= 1
if in_degree[d] == 0:
queue.append(d)
if len(output) < len(in_degree):
return ""
return "".join(output)
```
this is so clever
this can also be represented with emoji's instead of alphanumeric characters such as the following:
1. 🧠❤🚀 🧠
2. 🧠❤ 🧠 🚀
3. 🧠❤ 🧠 🚀 🧠
4. ❤🚀