Best talk on the realities of AI that addresses the modern architectures and their fundamental flaws plus (more importantly) alternative architectures that address these flaws quite clearly. This is the MUST WATCH video for AI practitioners and the AI curious.
here's a ChatGPT summary:
- Welcome to the last Distinguished Lecture Series talk of the academic year at the Institute for Experiential AI
- Introducing Yann LeCun, VP and Chief AI Scientist at Meta, Silver Professor at NYU, and recipient of the 2018 ACM Turing Award
- Overview of current AI systems: they are specialized and brittle and don't reason or plan; unlike humans and animals, they don't learn new tasks quickly, don't understand how the world works, and don't have common sense
- Self-supervised learning: train system to model its input, chop off last few layers of neural net, use internal representation as input to downstream task
- Generative AI systems: autoregressive prediction, trained on 1-2 trillion tokens, produce amazing performance, but make factual errors, logical errors, and inconsistencies
- LLMs are not good for reasoning, planning, or arithmetic, and easily fool people into thinking they are intelligent
- Autoregressive LLMs have a short shelf life and will be replaced by better systems in the next 5 years.
- Humans and animals learn quickly because they accumulate an enormous amount of background knowledge about how the world works by observation.
- AI research needs to focus on learning representations of the world, predictive models of the world, and self-supervised learning.
- AI systems need to be able to perceive, reason, predict, and plan complex action sequences.
- Hierarchical planning is needed to plan complex actions, as the representations at every level are not known in advance.
- Predetermined vision systems are unable to learn hierarchical representations for action plans.
- AI systems are difficult to control and can be toxic, but a system designed to minimize a set of objectives will guarantee safety.
- To predict videos, a joint embedding architecture is needed, which replaces the generative model.
- Energy based models are used to capture the dependency between two sets of variables, and two classes of methods are used to train them: contrastive and regularized.
- Regularized methods attempt to maximize the information content of the representations and minimize the prediction error.
- I-JEPA (an image joint-embedding predictive architecture) is a new method for learning features for images without having to do data augmentation.
- It works by running an image through two encoders, one with the full image and one with a partially masked image.
- A predictor is then trained to predict the feature representation of the full image from the representation obtained from the partial image (a rough sketch of this idea follows this summary).
- Joint-embedding architectures like this are used to build world models, which can predict what will happen next in the world given an observation about the state of the world.
- Self-supervised learning is the key to this, and uncertainty can be handled with energy-based methods.
- LLMs cannot currently say "I don't know the answer to this question" as opposed to attempting to guess the right answer.
- Data curation and human intervention through relevance feedback are critical aspects of LLMs that are not talked about often.
- The trend is heading towards bigger is better, but in the last few months, smaller systems have been performing as well as larger ones.
- The model proposed is an architecture where the task is specified by the objective function, which may include a representation of the prompt.
- The inference procedure that produces the output is separated from the world model and the task itself.
- Smaller networks can be used for the same performance.
- AI and ML community should pivot to open source models to create a vibrant ecosystem.
- Biggest gaps in education for AI graduates are in mathematics and physics.
- Open source models should be used to prevent control of knowledge and data by companies.
- LLMs are doomed and understanding them is likely to be hopeless.
- Self-supervised learning is still supervised learning, but with particular architectures.
- Reinforcement learning is needed in certain situations.
- Yann discussed the idea of amortized inference, which is the idea of training a system to approximate the solution to an optimization problem from the specification of the problem.
- Yann believes that most good ideas still come from academia, and that universities should focus on coming up with new ideas rather than beating records on translation.
- Yann believes that AI will have a positive impact on humanity, and that it is important to have countermeasures in place to prevent the misuse of AI.
- Yann believes that AI should be open and widely accessible to everyone.
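As an aside, the masked two-encoder idea in the bullets above can be sketched roughly in PyTorch. This is a toy sketch under my own assumptions (a tiny conv encoder, a fixed block mask, a frozen target branch); it is not LeCun's actual I-JEPA recipe, which uses ViTs, sampled block masks, and an EMA target encoder.

```python
# Toy sketch (my own assumptions): a tiny conv encoder, a fixed block mask, and a
# frozen target branch. Not the real I-JEPA recipe.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

target_encoder = Encoder()    # sees the full image
context_encoder = Encoder()   # sees the partially masked image
predictor = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))

opt = torch.optim.Adam(
    list(context_encoder.parameters()) + list(predictor.parameters()), lr=1e-4
)

def train_step(images: torch.Tensor) -> float:
    masked = images.clone()
    masked[:, :, 32:, 32:] = 0.0               # crude block mask, for illustration only

    with torch.no_grad():                      # target features: full image, no gradient
        target_feats = target_encoder(images)

    pred_feats = predictor(context_encoder(masked))
    loss = ((pred_feats - target_feats) ** 2).mean()   # predict features, not pixels

    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

print(train_step(torch.randn(8, 3, 64, 64)))
```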
You could have got it to include timestamps, particularly as they haven’t published this with chapters.
@@StoutProper By all means go ahead and do it.
@@RufusShinra you’ve already fed it the transcript complete with timestamps. Just instruct it to add a timestamp
@@StoutProper I didn't do Jack :D i'm not the OP
@@RufusShinra embarrassing
@56:00 It is because the CNN approach is not the same as statistical curve fitting; it is more akin to Fourier decomposition (to take an overly simplistic analogy). The big difference from an FFT is that a specialized CNN is basically an encoding of a whole giant bundle of generalized Fourier or Daubechies decompositions. An AI task space is vast, so you cannot find a complete set of orthogonal states, but the CNN provides enough of a basis in some cases; those are the cases where NNs work. The statistical-algorithm aspect is the search for a reasonable decomposition in the task space for a given request/prompt/whatever.
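To make the analogy concrete, a toy sketch: coefficients on a fixed, orthogonal Fourier basis versus the responses of a small, non-orthogonal filter bank of the kind a CNN layer learns. The filters here are random and purely illustrative.

```python
# Toy illustration of the analogy: a fixed orthogonal basis (the FFT) versus a
# learned-style filter bank (a stand-in for one CNN layer). Filters are random here.
import numpy as np

rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 8 * np.pi, 256)) + 0.1 * rng.standard_normal(256)

# 1) Classical decomposition: coefficients on a complete, orthogonal basis.
fourier_coeffs = np.fft.rfft(signal)

# 2) "CNN-style" decomposition: responses of a bank of local, non-orthogonal filters.
filter_bank = rng.standard_normal((16, 9))        # 16 filters of width 9
responses = np.stack([np.convolve(signal, f, mode="same") for f in filter_bank])

print(fourier_coeffs.shape)   # (129,)    global coefficients
print(responses.shape)        # (16, 256) local, overcomplete "coefficients"
```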
Is his deck available anywhere?
It would be very helpful if there was a list of open problems in machine learning space
Ask gpt or bing
I think that this guy makes the most sense relative to LLMs and their inability to plan, reason, or think.
Prof. LeCun and Meta AI have been foundational in the field. A voice of reason. But LLMs do show some ability to reason: look at Voyager from NVIDIA and MM-REACT from Microsoft. Agility Robotics and Google have even showcased their reasoning abilities. But we have a ways to go. There is most likely a fundamental shift in architecture coming soon, like LeCun said.
LeCun doesn't acknowledge that if you prompt GPT-4 to come up with a plan and explain itself step by step, it does so! People are getting 20% improvements on its already college-level performance in all kinds of tasks by coming up with more and more elaborate instructions, like telling it to generate multiple candidate responses and then act as a professor reviewing all the responses, looking for the one with the fewest errors (roughly the trick sketched below).
Despite their flaws, LLMs DO respond as if they plan, reason, and think. Yann LeCun is criticizing the leaders from the vantage point of Meta, which has fallen behind.
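Roughly the trick being described, as a hedged sketch. call_llm() is a made-up stand-in for whatever chat API you use; the prompts and the crude parsing are illustrative only.

```python
# Hypothetical sketch: sample several candidate answers, then ask the model to act
# as a professor and pick the one with the fewest errors.
def call_llm(prompt: str, temperature: float = 0.7) -> str:
    raise NotImplementedError("plug in your favourite LLM API here")

def best_of_n(question: str, n: int = 5) -> str:
    candidates = [call_llm(question, temperature=0.9) for _ in range(n)]
    listing = "\n\n".join(f"Answer {i + 1}:\n{c}" for i, c in enumerate(candidates))
    judge_prompt = (
        f"You are a professor grading answers to: {question}\n\n{listing}\n\n"
        "Review each answer for errors and reply only with the number of the best one."
    )
    verdict = call_llm(judge_prompt, temperature=0.0)
    digits = "".join(ch for ch in verdict if ch.isdigit()) or "1"
    return candidates[min(int(digits), len(candidates)) - 1]
```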
@skierpage It's a competitive environment, but Yann is right about AGI not being here. His new idea of masking images and the new architecture might have some serious advantages. We will have to see how it plays out.
I completely agree GPT-4 is already a huge game changer! Hotz publicly stated it's an eight-headed mixture-of-experts model and kind of shrugged it off like it's not that cool; so why didn't anyone else do it? Not only has OpenAI brought AI to a larger audience and boosted global productivity, but more importantly they have excited venture capital back into AI and pushed Google, Meta, Nvidia, and Microsoft to alter their business strategies toward true AGI research.
We will look back and say that GPT-4 and OpenAI were the match to really light the fire in the AGI race.
so exciting to see Yann at EAI!!
@46:55 What does R(*) stand for? What kind of operator or operation is it supposed to represent?
Excellent Q & A!
The answer lies in the model of living systems; an important structure is still missing.
I think ChatGPT is good because tokens have the right properties for modeling ideas or thoughts. What is missing is the right structure for modeling interaction with the environment, in order to evolve toward AGI. That is the big question in modeling: you need a mathematical object with the right properties. In complex systems it is the generating monoid of the topos that lets you untangle everything and then connect domains that seem very far apart a priori.
1:17:25 "at some point it will be too big for us to comprehend"
Before that point is reached we should have figured out alignment, moved away from black-box systems so we can actually see what is going on inside, and made the many societal changes that will be needed for societies to be and stay stable.
With respect to LLMs, the evidence suggests that they do know a great deal about how the world works. For example, GPT-like models' weights are (or can be) trained on data that effectively subsumes all that is currently known about physical laws, chemistry, biology, etc., through countless papers, review articles, and textbooks at various levels of sophistication, from first-grade level through cutting-edge research. The fact that these are given as text (language) is not as problematic as it sounds, since the relevant written record appears sufficient to explain and convey the current and past knowledge in these subjects. That multi-layer transformers learn context and correlations between (meaningful!) words and their associated concepts and relationships should not be underestimated. That the models "just produce" the next probable token isn't conceptually trivial either, if one considers that, for example, most of physics can be described through (partial) differential equations that are integrated step by step, where the context/state-dependent coefficients of the equations (the trained weights of the network) ultimately come from the underlying theories those equations express. Processing the current state, with those coefficients in context, to predict what happens next, one step at a time, is exactly how such equations are numerically integrated in practice. So what we may have with the current LLMs are models that learn from language and words that describe in excruciating detail what is known to man, and that then proceed to "auto-complete" in ways analogous to the best methods used to solve the known equations of science.
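To make the integration analogy concrete, a minimal sketch with a made-up one-dimensional ODE standing in for "the equations of science": explicit Euler integration generates each state from the previous one, much as an autoregressive model generates the next token from the context, with the constant k playing the role the comment assigns to trained weights.

```python
# Minimal sketch: Euler integration of dx/dt = -k * x is itself "autoregressive" --
# each state is predicted from the previous one using fixed coefficients.
import numpy as np

def euler_autoregress(x0: float, k: float, dt: float, steps: int) -> np.ndarray:
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + dt * (-k * x))   # "predict" the next state from the current one
    return np.array(xs)

trajectory = euler_autoregress(x0=1.0, k=0.5, dt=0.1, steps=50)
print(trajectory[:4])   # [1.0, 0.95, 0.9025, 0.857375]
```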
Knowing about them and understanding how to use them may still not be the same thing. There was a study on AlphaGo that took place after AlphaGo had beaten the greatest Go players. In that study they gave AlphaGo a mulligan, an advantage of a few turns of the kind that would be given to children just starting out to learn the game. The result was that AlphaGo then lost every single game. The researchers who analyzed this concluded that the strategy of surrounding the opponent's pieces was not really understood conceptually.
I think it is at times hard to differentiate what we see as a result and outcome, the product of the calculations, and then also make correct assumptions about how the LLMs got there. After all, they are said to be black-box systems, and researchers will still take a while to figure out exactly what is going on inside their neural networks.
On the other hand, we humans tend to put ourselves on a pedestal and make ourselves out to be something special. The image we have of ourselves is often quite inflated, which could lead to misinterpretations and to underestimating what is going on. We may all be partial to the Dunning-Kruger effect, anthropomorphisation, and other psychological traps. Understanding and admitting that to ourselves seems key for many of the issues I myself have at times with other people, and while that in itself may open up another trap (thinking others have the same issues), I still think it is safe to assume that most do.
Long story short, not just the hype but a sort of panic that set in over the past year around the topic of AGIs and LLMs seemed to have more to do with our human failings and how we would use such technology than with it already being, on its own, the biggest threat to humanity. Still, it is a wake-up call about the direction this is taking and what we can expect in the not-too-distant future.
@@kinngrimm the biggest threat isn't the technology itself but how it is used; more accurately, who is using it and for what? The answer: rich corporate elites to replace us, rich government elites to control us, and rich military-industrial elites to kill us. Examples of each are already prevalent throughout the world; meanwhile, people are distracted by the notion that the technology itself is the danger and are failing to focus on the real threats.
Open Source foundation models are the future of democracy and small business development
22:30 I believe that human ability to interact with the world is fundamental to our learning speed. We automatically mine hard examples around us.
People have an internal model of the world, and it constantly evaluates what happens around us. If it is something we expect, we find it boring, and if it is unexpected, we find it interesting (that is putting it very roughly; there are other rewards in play, e.g. hunger, thirst, sex drive, etc.). By seeking new, interesting experiences we constantly improve our model of the world.
Machines cannot do that; they have to learn from the data given to them, which starts becoming repetitive very quickly. The amount of novel data drops off fast. So much data is required in part because the amount of new information goes down exponentially if you just collect everything.
It also seems we create our own training data, forming opposing views in our heads, weighing pros and cons, and constantly trying to settle on the optimal position. This also fits well with the fact that the most intelligent people around you are often those who self-reflect the most.
@@daarom3472 hence why depression isn't an affliction of idiots
about the singularity
As Ray Kurzweil says
When the whole universe becomes a computer,
What does it calculate?
Even though the purpose for calculating has already disappeared?
The universe is already a computer.
Are the slides available online ?
drive.google.com/file/d/1RFxtgLv0q_tKmfzxWG0IOAL8Bre4Id6B/view
Finally someone who understands why LLMs are bound to fail. It's unbelievable how people are underestimating the difficulty of building a cognitive architecture. People are literally expecting that a fast, polynomial-time algorithm like a feed-forward neural net can solve all the problems of humanity. Yet logicians have long explained the concept of NP-hardness: logical problems can't, in general, be solved efficiently by a machine, no matter how sophisticated it is. In some sense, scientific problems don't have "patterns"; they're all different, so a machine that learns patterns is pretty useless for them. That's why progress is slow, and why even to be possible it takes billions of intelligent brains working in parallel with incredibly structured communication. So good luck with LLMs...
They left the CNN ship and are now travelling on the transformer ship. We are now asking them to leave the transformer ship and board the SSL ship. Not gonna happen, IMO. LLMs, and transformers in general (which were conspicuously absent from the talk), are finding their sweet spots in a very large space of business applications where they tend to perform very well on real (non-virtual) problems. It's like asking users to leave a search engine to be on a social network.
Coincidentally, I was just discussing a similar concept with ChatGPT: a system where multiple agents run parallel tasks that are integrated and analyzed for verification and optimization, before reaching a consensus and producing a result. This would give a Digital Intelligence the capacity to develop novel solutions, learn on its own, and fact-check itself. Neuro-Synthesis Heuristic Architecture, or NeurSHArc.
Shame on the recording engineer for screwing up the mic'ing of the person making the introduction. So much distortion 🙉
Yann mentions at 1:09:16 that a lot of the mathematics of neural networks comes from statistical physics, but I wonder what mathematics he's referring to, since most of the mathematics I saw when I learned statistical physics was much more basic than some of the mathematics I've seen from the likes of Yi Ma and LeCun.
Reverse diffusion for one
@@edz8659 I never learned anything about reverse diffusion in my statistical physics courses. Neither did we learn about stochastic differential equations for example. I actually learned more about Brownian motion and Wiener processes when I worked as a quant.
I would say the statistical tools are from quantum physics... not mechanical physics.
@@nicktasios1862 Brownian motion is statistical physics, and spin glasses and entropy are a good bridge between phase transitions (statistical physics) and decision boundaries in data spaces
The configurator is the soul, the spirit! :D In the case of AIs, the configurator is a human operator 😇
A great analogy!
34:54 "AI is not going to kill us all. Or, we would have to screw up pretty badly for that to happen."
Hey Yann, have you met human beings? Or the history of science?
I think his point was: machines are so far off from being human level intelligent, that it's like being worried that bears will kill us all.
Amazing and insightful. We shouldn't fall after the fear mongering hype. Thank you!
@9:30 _representing_ meaning (syntactics conveying semantics) is not understanding meaning. It is a transfer. It is gross when otherwise good scientists anthropomorphise machine behaviour. Behaviour is not thought, thought is not pure behaviour. Generativity is not subjective creativity. Deterministic automatons can behave generatively.
A masterclass
When LLMs are put into an ensemble with databases they can be made factual, actually; the reason is that the LLM is good at fusing query results. When LLMs are put into an ensemble with strategy-specialist models they can be made into planners; the Alpha family of models is a planner. When LLMs are augmented with persistent storage they can be made to remember what they learn. The LLM alone is not the way forward, but the LLM with various augmentations seems very promising.
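For the first of those ensembles (LLM + database), a hedged sketch: retrieve facts first, then let the model fuse the query results into an answer. query_database() and call_llm() are made-up stand-ins, not any particular product's API.

```python
# Hypothetical sketch of the "LLM + database" ensemble.
def query_database(question: str) -> list[str]:
    raise NotImplementedError("e.g. a SQL query, vector search, or knowledge-graph lookup")

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API here")

def grounded_answer(question: str) -> str:
    facts = query_database(question)
    context = "\n".join(f"- {fact}" for fact in facts)
    prompt = (
        "Using only the facts below, answer the question. "
        "If the facts are insufficient, say you don't know.\n\n"
        f"Facts:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)     # the LLM's job is just to fuse the query results
```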
Merci BOKU - fully agree! Humans sometimes (not always) have conceptual understanding and common sense; we call this Hausverstand in German.
*beaucoup
@@skierpage thanks - my university's acronym is BOKU, hence the wordplay :-)
I think the 'configurator' in LeCun's architecture for Autonomous AI is perhaps better called an 'orchestrator'.
The fact that humans and animals learn or think differently from ML models like LLMs doesn't mean those models can't surpass human level. LeCun is stuck on the idea that the best solutions stem from our understanding of the brain. It worked for computer vision (CNNs), but who said it should apply to everything else?
No mention of “understanding” in the title, not even trying these days
guess what these machines will eventually 'plan' when they learn to 'learn, reason, and plan'
My question to Yann would have been "I'm an idiot. If I offered you a job here and now without me being able to give you practically anything you want in return, would you come and work for me as my super intelligence?"
Exactly! How naive are the people in this field who trust the version of the future where we successfully enslave an intelligence 100,000X smarter than us? How dumb can people possibly be??
Summary from ChatGPT :
Yann LeCun discusses the state of AI, the limitations of current machine learning systems, and the potential pathway to more intelligent machines. He highlights the success of self-supervised learning and generative text models, but also mentions their shortcomings in factual accuracy and reasoning.
Highlights
🤖 Self-supervised learning, using prediction to train a system to represent data, is widely successful in natural language understanding.
📝 Generative text models, trained on trillions of tokens and using auto-regressive prediction, can generate plausible but often factually inaccurate text.
🧠 To achieve more intelligent machines, we need to move beyond machine learning to systems that can reason, plan, and understand the world.
I am amazed how certain he is that safety can be encoded, and yet in the first answer he explains how the current theory is not "all-knowing". Sounds very arrogant to me. Arrogant to the point of being ignorant, even reckless.
He is so naive; he truly believes that we can successfully enslave an intelligence 100,000X smarter than us. And yet he has no theoretical explanation for how we might be able to do so, while being at the bleeding edge of AI development. People like him will accelerate the extinction of our species.
Except he's a Turing Award-winning AI scientist 😅
😂the revolution will not be supervised!
It's actually already happening...
Isn't Yann LeCun completely disgraced by now? How he has a job with Meta is amazing.
How is he disgraced?
The guy is a total clown.
@@albertodelrio5966 I heard he's being viciously attacked for his frenchspeak. Also, all the LLMs he's part of building will give regular random interjections of "baguette" and "omelette du fromage"
@@albertodelrio5966 anyone who doesn't bow to the supreme gentooman and doomer chief Eliezer Yudkowsky has apparently fallen off and isn't even a real scientist
@@daarom3472 he prefers jazz to rap. Quelle horreur!
I like LeCun, but he should take a tip from me: you are not going to get machine AGI. Billions of years of evolution could not produce a single other species like humans. We are (obviously) spiritual beings. You have to ask, as an AI scientist (not an engineer; they have no clue): what good can the smartest animal get you? A: a lot, but of limited scope. The future (I predict, FWIW; no such prediction is worth much) is that highly specialised AI tools are the way to go. Humans will be enhanced with a bundle of such tools, like cavemen became advanced with fire, ploughs, and spears (a weak analogy, but about right). More tool good. Get rid old tool. More time play. Me want hack. More time sex.
Your tip is wrong. Evolution doesn't try to build intelligence, it's just certain genetic sequences getting reproduced slightly more each generation. People are explicitly building machines to be smart, and sharing most of what they learn with other researchers so that improvements spread on a monthly basis. The success is phenomenal.
I sense tremendous danger ahead for humanity. AI crime, and losing your job to robots, AI agents, and plug-ins, is unacceptable. AI job loss is here. So is AI as a weapon. Can we please find a way to cease AI/GPT? Or begin pausing AI before it's too late?
Yann's hierarchical planning model is nonsense. Yann has fallen into the trap of believing he personally needs to understand and map the process himself rather than letting the AI system do it. The important feature of AI planning is that the AI needs to feed back on itself and evaluate its initial "thought", not compare against some other "world model".
I see the process as being much simpler (a rough sketch follows this comment):
Feed forward with the original question and obtain an answer.
Take that answer, the original question, and an instruction to analyse and improve on the answer, producing a further answer.
Repeat using the improved answer until processing reaches some threshold.
Do a number of these in parallel.
Then pass the question and the answers into the model with the instruction to evaluate them and choose the best one.
There isn't even a need to re-tokenise; use the vectors directly on subsequent passes.
This architecture should also work for questions that naturally need a number of steps to evaluate, like a fair bit of mathematics.
We know that you can refine an answer by having a conversation, and this possible architecture automates that to some extent.
They may already be doing something similar to this in more recent models.
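Something like this, as a rough sketch of the loop just described. call_llm() is a made-up stand-in for whatever API you use, the prompts and thresholds are illustrative, and unlike the comment suggests, this version re-feeds text rather than reusing the internal vectors directly.

```python
# Hypothetical sketch: refine an answer several times, run a few such chains in
# parallel, then ask the model to pick the best candidate.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API here")

def refine_chain(question: str, rounds: int = 3) -> str:
    answer = call_llm(question)                       # feed forward once
    for _ in range(rounds):                           # iterate up to some threshold
        answer = call_llm(
            f"Question: {question}\nCurrent answer: {answer}\n"
            "Analyse the current answer and produce an improved answer."
        )
    return answer

def plan_and_answer(question: str, branches: int = 4) -> str:
    candidates = [refine_chain(question) for _ in range(branches)]   # run several chains
    listing = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(candidates))
    verdict = call_llm(
        f"Question: {question}\n\nCandidate answers:\n{listing}\n\n"
        "Evaluate the candidates and reply only with the number of the best one."
    )
    digits = "".join(ch for ch in verdict if ch.isdigit()) or "1"
    return candidates[min(int(digits), len(candidates)) - 1]
```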
We're waiting for your paper...
@Franky Vincent The whole argument is based on the fact that you can ask an LLM such as ChatGPT to improve on its answer, and it almost always does. Demonstrably. Feeding forward once isn't optimal.
Let's try your approach. Prompt: solve nuclear fusion. Answer: it's a very difficult problem, I can't provide a solution. Prompt: improve your answer. Answer: it's very, very difficult, I can't provide an answer. Prompt: improve your answer. Answer: it's very, very, very difficult, I can't provide an answer.................
Out of metaphor, if your basic neural net is incapable of thinking, it will at each step provide a shallow idea. So when it comes to significant problems that require innovative thinking, GPT will go through an infinite loop.
@federicoaschieri They're not actual responses, are they? Like all questions, if the information doesn't exist to formulate an answer, then an answer can't be formulated. Experimentation and acquiring new knowledge is the best path forward for the time being.
@@medhurstt For every interesting open problem out there we don't have the information to solve it straightforwardly. That's why your idea is not very useful, and one has to build sophisticated cognitive architectures, like LeCun explained. Many years of research are still needed to figure them out.
Yann LeClown 🤡
Why?
It's astonishing how people have the audacity to say that a very expensive and overhyped web scraper can lead to AGI. In what world? That of roaches or ants? Maybe, but not in the real human world; at least not the intelligence we refer to in the textbooks. No matter how many paid lunatics like Eliezer or others like him try to create hype and scarcity so more people will use these systems and companies can make large profits.