Would've loved an example image for the pixel shuffle there too, to really grasp what is happening.
Was just about to leave a comment to say this! Was waiting for some example images, would be great to keep in mind for future videos!
@ziggycross I wanted some images too. I didn't fully understand what the output is going to be with pixel shuffle.
Edit: grammar will always be difficult for me
It's not working on actual pixels. The 'depth', i.e. the input to the shuffle, is the set of feature maps generated from the low-res image, and it's only at this last stage that the image is upsampled. This is in contrast to older methods that would upsample the image straight away and then try to process that into the super-resolution output, which was both less efficient and potentially introduced the artifacts mentioned in the video. For more information, see the paper referenced in the video: "Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network" by Shi et al.
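A rough PyTorch sketch of that layout, loosely following the three-layer network from the Shi et al. paper (treat the exact kernel sizes and channel counts as illustrative): every convolution runs at the low resolution, and only the final PixelShuffle produces the high-res image.

import torch
import torch.nn as nn

r = 3  # upscale factor

model = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=5, padding=2), nn.Tanh(),   # feature extraction at low res
    nn.Conv2d(64, 32, kernel_size=3, padding=1), nn.Tanh(),
    nn.Conv2d(32, 3 * r * r, kernel_size=3, padding=1),      # 3*r^2 feature maps per low-res pixel
    nn.PixelShuffle(r),                                       # (N, 3*r^2, H, W) -> (N, 3, H*r, W*r)
)

lr = torch.randn(1, 3, 64, 64)   # low-res input
print(model(lr).shape)           # torch.Size([1, 3, 192, 192])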
@djmips - now I understand, thanks.
I have to say, that's really awesome! Especially the hint that transposed convolution is just the gradient computation of convolution w.r.t. its inputs. I regularly contribute to the backends of deep learning frameworks in the Julia programming language, and transposed convolution (or deconvolution, or, to say it in a freaky way, fractionally strided convolution) is really just a call to the function that calculates the adjoint (gradient) of a normal convolution (except for output_padding, but that only affects the size calculation anyway).
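You can verify that equivalence numerically in PyTorch (the shapes here are arbitrary):

import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8, requires_grad=True)
w = torch.randn(4, 3, 3, 3)

y = F.conv2d(x, w, stride=2, padding=1)   # (1, 4, 4, 4)
g = torch.randn_like(y)                   # pretend upstream gradient
(y * g).sum().backward()                  # dL/dx via autograd

# the same gradient, computed as a transposed convolution of g with the same kernel;
# output_padding only resolves the ambiguity in the output size
via_transpose = F.conv_transpose2d(g, w, stride=2, padding=1, output_padding=1)

print(torch.allclose(x.grad, via_transpose, atol=1e-5))   # True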
Thanks. I was already using that for quite some time in my super-resolution upscaler. Downside of the TensorFlow implementation, as far as I know: you can only use square factors, but it would make sense to also do it in just one dimension, or more generally with a rectangular factor. Some work to be done there...
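The rectangular (or 1-D) version is just a reshape and permute, though. A PyTorch-style sketch of a hypothetical pixel_shuffle_rect helper (not an existing library function):

import torch

def pixel_shuffle_rect(x, rh, rw):
    # rearrange (N, C*rh*rw, H, W) -> (N, C, H*rh, W*rw); rw=1 gives a purely 1-D shuffle
    n, c, h, w = x.shape
    c_out = c // (rh * rw)
    x = x.view(n, c_out, rh, rw, h, w)
    x = x.permute(0, 1, 4, 2, 5, 3)            # (N, C, H, rh, W, rw)
    return x.reshape(n, c_out, h * rh, w * rw)

x = torch.randn(2, 12, 10, 10)
print(pixel_shuffle_rect(x, 3, 2).shape)   # torch.Size([2, 2, 30, 20])
print(pixel_shuffle_rect(x, 4, 1).shape)   # torch.Size([2, 3, 40, 10])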
Beautiful work as always
This made it make a ton of sense. But one problem: pixel shuffle does not get rid of the artifacts, it introduces its own artifacts.
One of the best channels! I wish you'd cover more topics than only CNNs, but I guess you can't be a top pro in every topic. I def subbed and wish you had way more videos already. But I can see that it takes a lot of time and effort, so I will wait. Thank you so much for this work ❤
Great content. Thank you!
👍 You make awesome illustrations ❤ Can you explain Transformer encoding and inference?
That would be a big hit also. 👏
this video should have way more likes...
This is really cool! 😄 Thanks for the information.
Loved the animation thank you!!
Great series! Keep it up :)
thanks for your effort
Would be nice to have a video about TensorTrain technique
Hi, isn't this virtually the same effect as a stride-2, 2x2 transposed convolution, with the output channel count just being 4 times smaller? It's a convolutional filter with some binary weights that causes each pixel's channel to be mapped to some new channel. The aforementioned transposed convolution would be the same if you just had a linear layer before the pixel shuffle.
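That equivalence is easy to check numerically: a 1x1 convolution (the "linear layer") followed by pixel shuffle matches a single stride-2, 2x2 transposed convolution whose kernel is just the 1x1 weights rearranged. The shapes below are arbitrary:

import torch
import torch.nn.functional as F

n, c_in, c_out, r = 1, 8, 3, 2
x = torch.randn(n, c_in, 5, 7)
w1x1 = torch.randn(c_out * r * r, c_in, 1, 1)          # the "linear layer" weights

# path A: 1x1 conv to c_out*r^2 channels, then pixel shuffle
a = F.pixel_shuffle(F.conv2d(x, w1x1), r)

# path B: one stride-2, 2x2 transposed conv with the same weights rearranged
wt = w1x1.reshape(c_out, r, r, c_in).permute(3, 0, 1, 2).contiguous()
b = F.conv_transpose2d(x, wt, stride=r)

print(torch.allclose(a, b, atol=1e-5))                  # True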
Do you have a paper or resource about the artifacts in the gradient when using strided 3x3 convs?
If you accept that a transposed convolution (kernel size 3, stride 2) produces gridding artifacts in the output image, then by definition a standard convolution (kernel size 3, stride 2) produces gridding artifacts in the gradient of its input. The reason is that transposed convolution is implemented as a literal call to the gradient function of standard convolution in TensorFlow and PyTorch.
I learned this at some point while studying the papers and code of the StyleGAN saga (nvlabs.github.io/stylegan2/versions.html). I wish I could narrow it down more for you if you're trying to cite this; I have a feeling I learned it from reading their code or one of their references. You'll notice that in all the versions of their code, they go out of their way to implement downsampling as a blur -> convolution rather than just a plain strided convolution. StyleGAN3 is all about aliasing.
It's probably because some input pixels overlap the convolutional filter only once (the ones in the centers), some overlap it 2 times (the ones on the sides but not the corners), and some overlap it 4 times (the ones in the corners). I wonder if using ConvNeXt's 2x2 convolutional layers still results in this sort of gradient artifact.
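Those overlap counts are easy to see directly: push a uniform gradient through a stride-2 conv and inspect the input gradient. The 2x2 stride-2 case (the kind of downsampling layer ConvNeXt uses) is included for comparison, though whether that removes the artifacts in a real trained network is only a guess:

import torch
import torch.nn.functional as F

x = torch.ones(1, 1, 8, 8, requires_grad=True)

# stride-2, 3x3 conv: input pixels are covered by 1, 2, or 4 kernel placements
F.conv2d(x, torch.ones(1, 1, 3, 3), stride=2, padding=1).sum().backward()
print(x.grad[0, 0])   # values 1, 2 and 4 in a grid pattern -> gridding in the gradient

# stride-2, 2x2 conv: every input pixel is covered exactly once
x.grad = None
F.conv2d(x, torch.ones(1, 1, 2, 2), stride=2).sum().backward()
print(x.grad[0, 0])   # all ones -> uniform gradient, no gridding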
Can you explain how you would pixel_unshuffle if the resolution is 4000x3000 (WxH) and the downscale_factor is 16?
There's zero explanation about how this would work with real images.
It's not working on actual pixels. The 'depth', i.e. the input to the shuffle, is the set of feature maps generated from the low-res image, and it's only at this last stage that the image is upsampled. This is in contrast to older methods that would upsample the image straight away and then try to process that into the super-resolution output, which was both less efficient and potentially introduced the artifacts mentioned in the video. For more information, see the paper referenced in the video: "Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network" by Shi et al.
But why is it necessary to do pixel shuffle? Why can't we just output an rH x rW x 3 matrix directly?
Reminds me a bit of sub-pixel interpolation.
hi
jif
yes
Super cool. Waiting for Transformers and BN/LN.