WOW! You are something else dude! No one provides content like you! Exceptional!
Thank you!
The video that I have been waiting for!!! Thank you 🙏🏻
Thank you!
Many thanks for this! It gives a much better understanding before reading the paper.
I'm glad to hear that!
As always, great job!👏🏻
@@aykutcayir64 thank you!
I was waiting for new video. Thanks for awesome work ❤😊
Thank you!
Woowwww awesome thanks for this ❤❤
Thank you!
Awesome!
Thank you!
Let’s make llama4 before llama4 🤝
With enough gpus 🤝
Data, algorithms, and computational power are the three key elements. Why hasn't anyone added more complex connection patterns to Transformers? We should consider increasing the algorithmic complexity of large language models (LLMs), analogous to the complexity of connections in the human brain. That way, we wouldn't need to endlessly increase the parameter count, especially since the number of artificial neurons already exceeds the number of human neurons. Moreover, we haven't seen designs resembling short-term memory neurons that operate at runtime.
We should aim to design a model that can, like humans, quickly read relevant articles when faced with a problem. During the reading process, it could summarize related content into short-term memory and continuously update it. Then, based on this short-term memory, the model could verify the correctness of answers, for instance, by writing code to check the answers. Wouldn't this approach allow us to make the model smaller?
It's a very good research question. The attention mechanism can itself be viewed as the kind of "short-term" memory you describe. I remember some papers that tried to make neural networks resemble human brain synapses, but the problem is that they didn't perform that well.
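The attention-as-short-term-memory point can be sketched with a toy scaled dot-product attention. This is a minimal NumPy illustration, not any specific model's implementation; the shapes and random values are made up:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Each query token "reads" the whole context and blends the values
    # it finds most relevant -- a working-memory-like lookup that is
    # recomputed from scratch at inference time.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 query tokens, dim 8
K = rng.normal(size=(6, 8))   # 6 context tokens to "remember"
V = rng.normal(size=(6, 8))
out, w = attention(Q, K, V)
print(out.shape, w.shape)     # (4, 8) (4, 6)
```

The caveat is that this "memory" only lasts as long as the context window, which is why it behaves more like short-term than long-term memory.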
@@uygarkurtai The variety of neurons in the human brain far exceeds the range of functions used in artificial neural networks. How can we expect a single model, like the transformer, to handle everything? Shouldn't we focus on designing more diverse neural functions to better reflect the complexity of the brain?
@@flashlin1 In that case we again end up with a computationally expensive model; it's a trade-off that is difficult to overcome. You may want to look at multi-model systems, which are the closest thing to what you describe: a combination of several models. If you're curious about mimicking the brain, also check out spiking neural networks.
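For anyone curious what makes a spiking neuron different from a regular artificial one, here is a minimal leaky integrate-and-fire (LIF) sketch; the threshold, leak, and input values are arbitrary choices for illustration:

```python
def lif_simulate(current, threshold=1.0, leak=0.9, v_reset=0.0):
    """Leaky integrate-and-fire neuron: the membrane potential decays
    (leaks) each step, integrates the input current, and emits a
    discrete spike whenever it crosses the threshold."""
    v = 0.0
    spikes = []
    for i in current:
        v = leak * v + i      # leak, then integrate the input
        if v >= threshold:
            spikes.append(1)  # fire a spike
            v = v_reset       # reset the membrane potential
        else:
            spikes.append(0)
    return spikes

# Constant input of 0.3 for 20 steps -> the neuron fires every 4 steps.
spikes = lif_simulate([0.3] * 20)
print(sum(spikes))  # 5 spikes
```

Communicating with sparse binary spikes instead of dense floats is what makes these models attractive for low-power hardware, but it also makes them hard to train with plain backpropagation, which is part of why they lag behind standard networks.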
@@uygarkurtai Why haven't we seen much progress with Spiking Neural Networks? My ideal concept of short-term memory should function during the inference phase, not be fixed during the training phase. Specifically, as the model processes an input question or reads through a large volume of articles, it should be able to summarize and store useful and relevant information in short-term memory, and only then generate an answer based on that.
Moreover, during the process of generating an answer, the model should be able to dynamically update the short-term memory. For example, if later predictions impact the earlier generated content, the model should revise the previous answers based on the new information before producing the final result.
Is there any model that works like this?
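The inference-time scratchpad described above could be sketched roughly like this. Everything here is hypothetical: a toy word-overlap score stands in for a learned summarizer, and the function names are made up:

```python
def relevance(sentence, question):
    # Toy relevance score: count of shared words with the question.
    # In the idea above, this would be the model's own judgment.
    q = set(question.lower().split())
    s = set(sentence.lower().split())
    return len(q & s)

def read_into_memory(documents, question, capacity=3):
    """Inference-time scratchpad: while reading, keep only the `capacity`
    sentences most relevant to the question, continuously pruning."""
    memory = []
    for doc in documents:
        for sentence in doc.split("."):
            sentence = sentence.strip()
            if sentence:
                memory.append((relevance(sentence, question), sentence))
        # Continuously update: prune memory to the most relevant entries.
        memory = sorted(memory, key=lambda p: p[0], reverse=True)[:capacity]
    return [s for _, s in memory]

docs = [
    "Transformers use attention. Attention scales quadratically with length.",
    "Short-term memory could be updated while reading. Cats sleep a lot.",
]
mem = read_into_memory(docs, "how could short-term memory help attention")
print(mem)
```

The key property is that the memory lives and changes during inference, not training; whether a learned version of this loop could actually replace parameters is exactly the open question.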
@@flashlin1 We haven't seen them because there are usually points where they fall short of regular MLPs. What you describe sounds a bit like RAG applications to me.
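For context, RAG (retrieval-augmented generation) follows a retrieve-then-generate loop. A minimal sketch, with toy word-overlap retrieval standing in for real embedding search and a made-up corpus:

```python
def retrieve(query, corpus, k=2):
    """Toy retriever: rank passages by word overlap with the query.
    Real RAG systems use dense embeddings, but the loop is the same:
    retrieve -> stuff into the prompt -> generate."""
    q = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(q & set(p.lower().split())),
        reverse=True,
    )
    return scored[:k]

corpus = [
    "Spiking neural networks communicate with discrete spikes.",
    "RAG augments a language model with retrieved passages.",
    "Bananas are rich in potassium.",
]
context = retrieve("how does RAG augment a language model", corpus)
# The retrieved passages would be prepended to the model's prompt:
prompt = "Answer using:\n" + "\n".join(context) + "\nQ: how does RAG work?"
print(context[0])
```

The difference from the scratchpad idea above is that vanilla RAG retrieves once up front, rather than continuously revising its memory while generating.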
Hi! I really enjoy your videos and the way you explain concepts. I recently implemented the Qwen-2 Vision model using pure PyTorch. There’s a small error I’m working through at the moment, but I’d love to know if you’d be open to making a video using my code to explain the process. I think it could be really helpful for others who are interested in vision language models. Let me know what you think
@@en-iyi-benim Hey, thank you! I may look at the Qwen-2 model in the future. You can share your repository here too once it's done.