Really interesting. Whether overfitting is involved definitely matters, but if it turns out that it isn't the case, or at least that it doesn't *have* to be (i.e. that hurdle can be overcome), this might lead to really nice compact networks.
From what I recall, there has also been recent work on doing this with a growth step added in, so layers can also become bigger if that turns out to be helpful. In that case the savings aren't quite as dramatic, but presumably (I'm not sure that's quite right?) the benefit is even higher accuracy, while *still* shrinking the network overall.
Could you expand on how the overfitting hurdle for pruned networks can be knowingly overcome? Also, any news on this topic you've come across in the year since this comment?
@@tdk99-i8n Do you think a Pareto front can be generated? Would it necessarily be weights vs. accuracy across the different networks generated?
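For what it's worth, such a front is easy to compute once you have (remaining weights, test accuracy) pairs for a set of pruned networks. A minimal sketch, with purely hypothetical numbers:

```python
def pareto_front(results):
    """results: list of (num_weights, accuracy) pairs; fewer weights and higher accuracy are better."""
    front, best_acc = [], float("-inf")
    for n_weights, acc in sorted(results, key=lambda t: (t[0], -t[1])):
        if acc > best_acc:                 # not dominated by any smaller network
            front.append((n_weights, acc))
            best_acc = acc
    return front

# hypothetical (remaining weights, test accuracy) pairs for a few pruned networks
candidates = [(266_000, 0.981), (54_000, 0.983), (27_000, 0.980), (13_000, 0.972)]
print(pareto_front(candidates))   # the 266k network is dominated by the 54k one
```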
I am a complete novice in this area, but I feel that the fact that they report accuracy on the test dataset means we can get an idea about overfitting. Likewise, accuracy on the training dataset gives a good idea about underfitting. Please do correct me if I am wrong; I am a learner here. :)
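To make that concrete, the check being described is just a comparison of the two accuracies. A minimal sketch with hypothetical values and arbitrary illustrative thresholds:

```python
# Hypothetical accuracies for a pruned network; thresholds are illustrative only.
train_acc, test_acc = 0.995, 0.962

if train_acc < 0.90:
    print("Low training accuracy -> the model is likely underfitting.")
elif train_acc - test_acc > 0.05:
    print("Large train/test gap -> the model is likely overfitting.")
else:
    print("High training accuracy and a small gap -> no strong sign of either.")
```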
The thing is, pruning existed before this paper and already showed good accuracy, so I don't see how this paper is revolutionary. Why would anyone want to retrain an already fully trained pruned network from scratch, instead of just fine-tuning it and saving time on training?
It's not like you know the sparse weight initialisations beforehand, which is what this presentation makes it seem. You would still need to train the fully connected larger network first, so this doesn't help much for any practical purpose (unless it somehow beats the accuracy of fine-tuning).
Seems like more hype than actual worth. Or maybe I'm getting something wrong.
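For reference, the procedure being debated is, roughly: train the full network, prune by weight magnitude, rewind the surviving weights to their original random initialization, and retrain. A simplified single-shot sketch in PyTorch terms (the `train`/`evaluate` helpers and the fixed pruning fraction are hypothetical stand-ins, not the paper's exact iterative setup):

```python
import copy
import torch

def lottery_ticket(model, train, evaluate, prune_fraction=0.8):
    """Sketch of train -> prune -> rewind-to-init -> retrain."""
    init_state = copy.deepcopy(model.state_dict())      # remember the random init

    train(model)                                         # 1. train the full network

    # 2. build binary masks keeping the largest-magnitude weights per layer
    masks = {}
    for name, param in model.named_parameters():
        if param.dim() > 1:                              # prune weight matrices only
            k = int(param.numel() * prune_fraction)
            threshold = param.detach().abs().flatten().kthvalue(k).values
            masks[name] = (param.detach().abs() > threshold).float()

    # 3. rewind the surviving weights to their ORIGINAL initial values
    model.load_state_dict(init_state)
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])

    # 4. retrain the sparse subnetwork; the masks would have to be re-applied
    #    after every optimizer step to keep the pruned weights at zero
    train(model, masks=masks)
    return evaluate(model)
```

The distinction the comment questions is exactly step 3: rewinding to the original initialization instead of fine-tuning the already-trained weights.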
@@siddharthagrawal8300 You are right; the value of the work rests on the assumption that adding to scientific understanding facilitates future practical improvements. I haven't kept up with the literature since this paper, so I don't know whether that assumption held true in this case.
I'm not sure whether I'm getting this right and would really like some elaboration. Assuming that the function to be learned is actually far less complex than the network architecture used for training, of course there is going to be one sub-network that performs best when looked at in isolation, simply because of random initialization. If my function can be modeled optimally with a single weight but I use 10, random initialization will make one weight learn 'the fastest/best'. So isn't the only question here how to find this weight, i.e. the pruning strategy? Or is my assumption flawed?
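As a toy version of that picture (my own illustration, not from the paper): give ten weights the exact same single useful feature. They all receive identical gradients, so their ordering is fixed at initialization, and pruning down to one weight simply recovers whichever weight happened to start largest:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 1))
y = 3.0 * x[:, 0]                     # the true function needs only ONE weight

w0 = rng.normal(scale=0.1, size=10)   # random init of 10 redundant weights
w = w0.copy()
X = np.repeat(x, 10, axis=1)          # every weight sees the exact same feature

for _ in range(300):                  # plain gradient descent on squared error
    w -= 0.05 * (X.T @ (X @ w - y)) / len(y)

# All ten weights receive identical updates, so their ordering never changes:
# together they sum to ~3 (fitting y = 3x), and the single 'winning' weight is
# whichever one happened to start largest, i.e. decided purely by the init.
print(round(w.sum(), 3))
print(np.argmax(w) == np.argmax(w0))  # True
```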
I think you have it right. The idea that you can prune early is interesting, but if you have enough compute power to buy all the lottery tickets, why not buy all of them?
I think the question "[will] random initialization make one weight learn 'the fastest/best'" should still be explored. Is random really the best? What if all the weights for a neuron (either all input or all output) end up positive?
Perhaps the more interesting point is that it's not the number of nodes or connections that is bad; it's the initialization. Why throw out the connection along with the weight? Couldn't we throw out the weight and then reinitialize it?
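If anyone wants to try that last idea, a minimal sketch of "reinitialise instead of remove" might look like the following, assuming a PyTorch model and a {parameter name: 0/1 mask} dictionary from some earlier pruning step (both hypothetical here):

```python
import torch

def reinitialise_pruned(model, masks, std=0.01):
    """Keep the surviving weights, but give the pruned positions fresh random values."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                fresh = torch.randn_like(param) * std
                # surviving weights stay as they are; pruned positions get re-drawn values
                param.copy_(param * masks[name] + fresh * (1.0 - masks[name]))
```

Training would then continue with every connection active again, which is a different experiment from the paper's keep-the-mask setup.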
Bottom line: have you been able to predict any winning numbers? Consistently? I would assume your program scores its predicted numbers against known winning numbers and tries to improve that score. Can you give us any info regarding how well the system learns and the training required?
There's work going in this direction: arxiv.org/abs/1909.11957
@@robvdm Thanks, keep up the good work.
13:53 The young guy is really rude.
Hello, do you like to play the Pick 3 lottery?
@@justinking5964 Never played it. Why?
@@hangchen I assumed you are not American. Just asking randomly.
@@justinking5964 Lol, yeah, your assumption is correct. Do most Americans play the Pick 3 lottery?
Very beautiful and convincing, but no result.
Because in the lottery it's important to guess the numbers and the date of the event; here it's just brute force.
This statement doesn't take into account that this is a first step which enables work that spins off the presented idea; one can argue that this counts as a result as well. Notable works relevant to your point might be:
a) subnetworks generalise to similar tasks and can act as an initialisation scheme, see Morcos et al.
b) there was recent work on how to find these subnetworks more efficiently than brute force, see You et al. and Tanaka et al.
I acknowledge that depending on when this comment was made this work might not have been available yet.