WOW! You are something else dude! No one provides content like you! Exceptional!
Thank you!
The video that I have been waiting for!!! Thank you 🙏🏻
Thank you!
Many thanks for this! It gives a much better understanding before reading the paper.
I'm glad to hear that!
As always, great job!👏🏻
@@aykutcayir64 thank you!
I was waiting for new video. Thanks for awesome work ❤😊
Thank you!
Woowwww awesome thanks for this ❤❤
Thank you!
Awesome!
Thank you!
Let’s make llama4 before llama4 🤝
With enough gpus 🤝
Data, algorithms, and computational power are the three key elements. Why hasn't anyone added more complex connection patterns to Transformers? We should consider increasing the algorithmic complexity of large language models (LLMs), analogous to the complexity of connections in the human brain. That way, we wouldn't need to endlessly increase the parameter count, especially since the number of artificial neurons already exceeds the number of human neurons. Moreover, we haven't seen designs resembling short-term memory neurons that operate at runtime.
We should aim to design a model that can, like humans, quickly read relevant articles when faced with a problem. During the reading process, it could summarize related content into short-term memory and continuously update it. Then, based on this short-term memory, the model could verify the correctness of answers, for instance, by writing code to check the answers. Wouldn't this approach allow us to make the model smaller?
It's a very good research question. The attention mechanism can itself be viewed as the kind of "short-term" memory you describe. I remember some papers that tried to make neural networks resemble human brain synapses, but the problem is that they didn't perform that well.
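The attention-as-short-term-memory point can be sketched with a toy scaled dot-product attention. This is a minimal NumPy illustration, not any specific model's implementation; the shapes and random values are made up:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Each query token "reads" the whole context and blends the values
    # it finds most relevant -- a working-memory-like lookup that is
    # recomputed from scratch at inference time.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 query tokens, dim 8
K = rng.normal(size=(6, 8))   # 6 context tokens to "remember"
V = rng.normal(size=(6, 8))
out, w = attention(Q, K, V)
print(out.shape, w.shape)     # (4, 8) (4, 6)
```

The caveat is that this "memory" only lasts as long as the context window, which is why it behaves more like short-term than long-term memory.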
@@uygarkurtai The variety of neurons in the human brain far exceeds the range of functions used in artificial neural networks. How can we expect a single model, like the transformer, to handle everything? Shouldn't we focus on designing more diverse neural functions to better reflect the complexity of the brain?
@@flashlin1 In that case we again end up with a computationally expensive model; it's a trade-off that is difficult to overcome. You may want to look at multi-model systems, which are the closest thing to what you describe: a combination of several models. If you're curious about mimicking the brain, also check out spiking neural networks.
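For anyone curious what makes a spiking neuron different from a regular artificial one, here is a minimal leaky integrate-and-fire (LIF) sketch; the threshold, leak, and input values are arbitrary choices for illustration:

```python
def lif_simulate(current, threshold=1.0, leak=0.9, v_reset=0.0):
    """Leaky integrate-and-fire neuron: the membrane potential decays
    (leaks) each step, integrates the input current, and emits a
    discrete spike whenever it crosses the threshold."""
    v = 0.0
    spikes = []
    for i in current:
        v = leak * v + i      # leak, then integrate the input
        if v >= threshold:
            spikes.append(1)  # fire a spike
            v = v_reset       # reset the membrane potential
        else:
            spikes.append(0)
    return spikes

# Constant input of 0.3 for 20 steps -> the neuron fires every 4 steps.
spikes = lif_simulate([0.3] * 20)
print(sum(spikes))  # 5 spikes
```

Communicating with sparse binary spikes instead of dense floats is what makes these models attractive for low-power hardware, but it also makes them hard to train with plain backpropagation, which is part of why they lag behind standard networks.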
@@uygarkurtai Why haven't we seen much progress with Spiking Neural Networks? My ideal concept of short-term memory should function during the inference phase, not be fixed during the training phase. Specifically, as the model processes an input question or reads through a large volume of articles, it should be able to summarize and store useful and relevant information in short-term memory, and only then generate an answer based on that.
Moreover, during the process of generating an answer, the model should be able to dynamically update the short-term memory. For example, if later predictions impact the earlier generated content, the model should revise the previous answers based on the new information before producing the final result.
Is there any model that works like this?
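The inference-time scratchpad described above could be sketched roughly like this. Everything here is hypothetical: a toy word-overlap score stands in for a learned summarizer, and the function names are made up:

```python
def relevance(sentence, question):
    # Toy relevance score: count of shared words with the question.
    # In the idea above, this would be the model's own judgment.
    q = set(question.lower().split())
    s = set(sentence.lower().split())
    return len(q & s)

def read_into_memory(documents, question, capacity=3):
    """Inference-time scratchpad: while reading, keep only the `capacity`
    sentences most relevant to the question, continuously pruning."""
    memory = []
    for doc in documents:
        for sentence in doc.split("."):
            sentence = sentence.strip()
            if sentence:
                memory.append((relevance(sentence, question), sentence))
        # Continuously update: prune memory to the most relevant entries.
        memory = sorted(memory, key=lambda p: p[0], reverse=True)[:capacity]
    return [s for _, s in memory]

docs = [
    "Transformers use attention. Attention scales quadratically with length.",
    "Short-term memory could be updated while reading. Cats sleep a lot.",
]
mem = read_into_memory(docs, "how could short-term memory help attention")
print(mem)
```

The key property is that the memory lives and changes during inference, not training; whether a learned version of this loop could actually replace parameters is exactly the open question.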
@@flashlin1 We haven't seen them because there are usually points where they fall short of regular MLPs. What you describe sounds a bit like RAG applications to me.
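For context, RAG (retrieval-augmented generation) follows a retrieve-then-generate loop. A minimal sketch, with toy word-overlap retrieval standing in for real embedding search and a made-up corpus:

```python
def retrieve(query, corpus, k=2):
    """Toy retriever: rank passages by word overlap with the query.
    Real RAG systems use dense embeddings, but the loop is the same:
    retrieve -> stuff into the prompt -> generate."""
    q = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(q & set(p.lower().split())),
        reverse=True,
    )
    return scored[:k]

corpus = [
    "Spiking neural networks communicate with discrete spikes.",
    "RAG augments a language model with retrieved passages.",
    "Bananas are rich in potassium.",
]
context = retrieve("how does RAG augment a language model", corpus)
# The retrieved passages would be prepended to the model's prompt:
prompt = "Answer using:\n" + "\n".join(context) + "\nQ: how does RAG work?"
print(context[0])
```

The difference from the scratchpad idea above is that vanilla RAG retrieves once up front, rather than continuously revising its memory while generating.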
Hi! I really enjoy your videos and the way you explain concepts. I recently implemented the Qwen-2 Vision model using pure PyTorch. There’s a small error I’m working through at the moment, but I’d love to know if you’d be open to making a video using my code to explain the process. I think it could be really helpful for others who are interested in vision language models. Let me know what you think
@@en-iyi-benim Hey, thank you! I may look at the Qwen-2 model in the future. You can share your repository here too once it's done.