Great talk, but I'm a little confused by the "why do we need any of this" part, where GPT-3 is given as an example. What makes GPT-3 different from any other statistical learning model? The talk says that a big network can learn the causal structure, but how many assumptions does that require for a neural network that only learns conditional probabilities? In Judea Pearl's view it is still on the "first rung" of the ladder of causation and cannot do interventional or counterfactual reasoning, so doesn't it seem that causality is necessary for human-level AI?
Yes, Elias Bareinboim even has a theoretical result, using measure theory, showing that observational models can't climb the ladder of causation; since GPT-3 is trained on observational data, it can't either. But I guess the point here is only that disentangled representations are being learnt in a distributed form. Such knowledge might not be enough to reach human-level intelligence, but I don't think human-level AI was being referred to here, only a practically useful model for some tasks.
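A toy illustration of that observational-vs-interventional gap, in case it helps. This is only a sketch and not the construction used in the theorem; the two models and every name below (`model_a`, `model_b`, `joint`, `p_y1`) are made up for the example. Both models produce exactly the same joint distribution over (X, Y), yet they disagree about P(Y=1 | do(X=1)), so nothing learned from the joint alone can tell them apart.

```python
from fractions import Fraction


def model_a(u, do_x=None):
    # Hypothetical model A: X := U, Y := X  (X causes Y).
    x = u if do_x is None else do_x
    y = x
    return x, y


def model_b(u, do_x=None):
    # Hypothetical model B: Y := U, X := Y  (Y causes X).
    y = u
    x = y if do_x is None else do_x  # do(X) cuts the Y -> X arrow
    return x, y


def joint(model, do_x=None):
    """Exact P(X, Y), enumerating the single fair-coin noise term U."""
    dist = {}
    for u in (0, 1):
        xy = model(u, do_x)
        dist[xy] = dist.get(xy, Fraction(0)) + Fraction(1, 2)
    return dist


def p_y1(dist):
    """Marginal probability that Y = 1 under the given joint."""
    return sum(p for (_, y), p in dist.items() if y == 1)


# Rung 1: the observational distributions are identical.
same_observational = joint(model_a) == joint(model_b)

# Rung 2: the interventional answers differ.
p_a = p_y1(joint(model_a, do_x=1))
p_b = p_y1(joint(model_b, do_x=1))

print("same observational joint:", same_observational)
print("P(Y=1 | do(X=1)) in A:", p_a)
print("P(Y=1 | do(X=1)) in B:", p_b)
```

Extra assumptions (for example, knowing the direction of the arrow) are what break the tie, which is the sense in which a purely conditional-probability learner needs more than data to answer interventional questions.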