CNN: Convolutional Neural Networks Explained - Computerphile

  • Published: 21 Sep 2024
  • Years of work down the drain, the convolutional neural network is a step change in image classification accuracy. Image Analyst Dr Mike Pound explains what it does.
    Kernel Convolutions: • How Blurs & Filters Wo...
    Deep Learning: • Deep Learning - Comput...
    Botnets: • Botnets - Computerphile
    AI's Game Playing Challenge: • AI's Game Playing Chal...
    Space Carving: • Space Carving - Comput...
    / computerphile
    / computer_phile
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscom...
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Comments • 508

  • @dickhamilton3517 8 years ago +72

    I like the 'extensive' library of books he has on that shelf above and behind him

    • @techsuvara 3 months ago +1

      Word Press Forms is all you need!

  • @Wardropulous 8 years ago +55

    This guy made all that really easy to follow. I admire his ability to explain such complicated things. He's really good at identifying and skipping over the irrelevant stuff, and focusing on the core problem/solution.

  • @Intelligenz_Bestie 8 years ago +149

    convolutional neural networks is one of those things that really needs some visuals, i find that it is really hard to 'grok' when you get it explained in a book or via speech but once you get a visual example it's kind of hilariously simple and scaringly plausible

  • @alcesmir 8 years ago +753

    Neat, I just did my bachelor thesis on convolutional neural networks. We built and trained a sign language interpreter that worked pretty well. I can affirm that neural networks are equal parts wisdom and witchcraft.

    • @riccardoorlando2262 8 years ago +84

      +Alcesmire Aha! Now that I know neural networks exist, I can start my Mathematics thesis with:
      "Let N be a neural network..."

    • @IceMetalPunk 8 years ago +7

      +Alcesmire That's actually pretty awesome! It could seriously improve the lives of deaf people, especially seeing as how we're moving more and more into the whole voice-controlled, NLP virtual assistant world.

    • @brcha 8 years ago +9

      Frankly, I don't see the wisdom part. Sure, when you design a NN, you do have to scale it correctly to the problem and so on, but once you've got everything setup, the rest is magic.

    • @verynicehuman 8 years ago +8

      Aren't neural networks just math? I've studied the backpropagation algo, stacked neural networks, etc., and the thing that struck me is that it's all just math that you learn in an engineering course, especially stats and linear algebra to solve equations. Why did you say it's "witchcraft"?

    • @brcha 8 years ago +42

      Sreekar Nimbalkar
      Because NNs are large sets of linear or non-linear equations with dynamically generated coefficients that mean absolutely nothing to the designers of NNs, but somehow work. Hence witchcraft.
      I mean, I could basically simulate a NN on paper and it would still work, while I would still have no idea why it works.

  • @IceMetalPunk 8 years ago +36

    Not long ago, I read about a machine learning system that was able to classify planes, trees, and people in nearly live video, all without ever having any hard-coded feature sets. The math was way over my head (despite being a computer scientist, specialized areas can still stump me at times). Now I look back at it, and it was in fact a CNN being used! This was a few years ago now, but if they just started becoming popular in 2012, that makes sense.
    Thank you for the higher-level explanation that allows me to understand it after all this time XD

  • @shariarpapaon5305 1 year ago +4

    i love watching mike out of all the other ppl on this channel. this man just sounds right

  • @dries2965 8 years ago +11

    I couldn't have explained it better, given the limitations of a YouTube video. Well done Computerphile!

  • @Gutagi 3 years ago +7

    For years down the line and look where we are now!
    What a time to be alive!

  • @badshabz1 1 year ago +3

    This is by far the best video I've watched on CNNs and I've watched 4 others. It really describes the back propagation and image compression to a single dimension.

  • @ninjamaster224 8 years ago +208

    "...check whether the photo is of a bird."
    "give me a research team and five years"

    • @baconology 7 years ago +2

      YES!!!!!!!!! GRANT MONEY!!! PAY ME! This is an ACADEMIC!!!!!!!!!!! $$$$$$$$$$$$$$

    • @baconology 7 years ago

      this is the best comment i've ever read. I feel you bro. Lived it!!!!!!!

    • @KnakuanaRka 5 years ago +2

      SaltyBrains I don’t get how that changes the meaning at all.

    • @KnakuanaRka 5 years ago +4

      Another xkcd fan?

    • @NareshTur 4 years ago +5

      @@KnakuanaRka That changes meaning because a computer did it. Now you can automate that process and probably hundreds such processes. Automated object detection can be used in a multitude of processes and industries. Google it and surprise yourself.

  • @harleyspeedthrust4013 4 years ago +11

    I wrote a neural network framework in Java that allows you to build neural networks with arbitrary shapes and structures. You can chain layers together and as long as you implement forward and backward functions, your layer will work. I implemented a lot of layers (fully connected, convolutional, pooling, etc.)
    And yes it's Java but the process was a valuable learning experience, much better for me than learning keras or something without knowing how it works

    • @IgorRoztr 1 year ago

      The best way of learning is building stuff that you want to understand😉

    • @harleyspeedthrust4013 1 year ago

      @@IgorRoztr absolutely agree. i wouldn't say i understand something unless i've built something like it

    • @ItsMine-fd3lq 9 months ago

      Hey... I tried to implement a cnn from scratch using python... But it is not working properly... I found few issues but not the solution for it.. I searched abt it in every website but couldn't find proper solution.. can u clarify my doubts, pls?

    • @ItsMine-fd3lq 9 months ago

      I jst want to clarify whether whatever ik is right or wrong and want to know the solutions for the issues

  • @NardiPaffon 6 months ago +3

    I guess the debate he mentions over whether neural networks will change everything, is settled now in 2024?

  • @chebkhaled1985 8 years ago +6

    Couldn't help but notice the WPF C# book; nice to see someone else who specialises in these two things

  • @WillNewton10 1 year ago +3

    So happy for Frodo Baggins and his new career as AI teacher

  • @bobiboulon 7 years ago +1

    I just discovered this channel, saw a bunch of videos and didn't come across a single boring one.

  • @overdrivegain 3 years ago +1

    I don't know why, but Dr Mike explains things so nice and clear. Thanks!

  • @rockrollinnolan8521 7 years ago +29

    Dang kernel convolutions.
    My least favorite thing to happen when I'm making popcorn.

  • @mohammadmousavi1 3 years ago

    Still after 4 years, this is the best explanation of CNN on youtube ...

  • @seasong7655 2 years ago +1

    Watching this again really helped me improve my network. Thanks

  • @luffyorama 8 years ago +1

    My friend had some project with ANN. And I have some project with image analysis. I never knew both can be linked with this technique! This might help me in my research!
    Thanks Sean, thanks Dr Mike!

  • @spaminbox 8 years ago +188

    this is all rather convoluted.

    • @Wowthatsfail 7 years ago +1

      fidelio your profile pic wins the internet for the day!

    • @AkshayAradhya 7 years ago +5

      I guess if you applied the sharpen kernel it would make things clearer

    • @baconology 7 years ago +1

      This is hilarious.

    • @paulkossey7543 4 years ago

      Google, where is a visible date of video upload in this YouTube?

  • @unvergebeneid 8 years ago +14

    Wow, interesting concept, nicely explained ... well done!

  • @alpardal 8 years ago +1

    Kudos to Mike, videos with him are always fun and well explained!

  • @martindinov932 7 years ago +1

    Thank you Computerphile for the great videos you put up.

  • @Davidemmanuelkatz 7 years ago

    So basically the whole convolution part is to "reduce" the dimensions, to then pass the information into a deep net?
    Really awesome videos ! Extremely addictive :)

  • @BeCurieUs 8 years ago +27

    Wow, "convolution process" just sounds a lot like abstraction that brains do....I think these guys are really onto something here...I dig it

    • @chris_1337 8 years ago +1

      +Christopher Willis really interesting way of looking at it.. what an exciting time to be alive!

    • @chris_1337 8 years ago

      +bibbly bobbly Thanks! Very interesting. I once read that the patterned hallucinations of LSD are probably caused by the acid disrupting the signal in our retina. Pretty interesting stuff.. Is that plausible in your opinion?

    • @thomasgandalf4111 8 years ago

      +bibbly bobbly thanks well said

    • @bug2k4 8 years ago +1

      As far as I know, the kernels that CNNs learn on the first hidden layer (without prior knowledge) are also very similar to patterns our visual cortex reacts to. So it's maybe even closer than you thought it is.
      I definitely find this quite fascinating :)

    • @RedNNet 8 years ago +2

      The part of the brain responsible for most mammalian intelligence (the neocortex) is considered hierarchical by many, sort of like neural networks. But it's not like there's a single layer of neurons in each level. Each level has millions of microcolumns of neurons (around 100 neurons per column), 3 to 9 or more layers (depending on how you count and the location in the hierarchy) with distinct properties, and multiple types of connections (inhibitory, excitatory, modulatory, different durations of effect, etc.) There are hundreds or thousands of underlying common characteristics in the neocortex alone, whereas neural networks have maybe one or two dozen underlying characteristics.
      I suspect some of that is just plumbing to deal with things like metabolism, but neural networks (except hierarchical temporal memory, which doesn't do anything the brain definitely doesn't do) are pretty unlikely to lead to brain-like intelligence. They're really useful, but the way they are designed is like trying to reinvent the computer by mimicking transistors and no other characteristics of computers.

  • @UberAlphaSirus 8 years ago +68

    Would you mind putting links in the description for annotated link videos for us mobile users, thanks.

    • @Computerphile 8 years ago +87

      +Sirus done >Sean

    • @UberAlphaSirus 8 years ago +11

      Thanks

    • @baconology 7 years ago +1

      Thanks. Annotated links are pretty much always off in the future. At least for nerds?

  • @dogukan463 3 years ago

    I was trying to understand cnns and dr mike comes to the rescue

  • @rubenkrupper5259 1 year ago

    Might be the best intuitive description I've come across!

  • @_tyrannus 8 years ago +79

    Extremely good explanation of things that, until this series on deep learning, were just black magic to me !

    • @compulsive_curiosity 8 years ago +10

      +turarwanaa Lucky you, watched it twice and I still think he is a dark wizard.

    • @HexerPsy 8 years ago +4

      +turarwanaa +J Simmons
      But doesn't it mean that it's all just an optimization program that does the magic?
      The training images get run through the process, and it produces a value. The settings are tweaked slightly, the results are compared and one is better than the other. Rinse and repeat.
      Working with some optimization programs myself, the trick is in how the algorithm is programmed to make large or small tweaks to settings...
      It's like finding the tallest mountain in the area while blind... Does crossing the valley lead to a taller mountain or should you just go uphill?
      It all seems CPU horsepower dependent to me o.o

    • @thomasgandalf4111 8 years ago +1

      +HexerPsy yes it's pretty much brute force, no magic. as with most things machine...

    • @Vulcapyro 8 years ago +2

      Thomas Gandalf It isn't anywhere near brute force either.
      The "magic" is in why neural networks work so well at all compared to other methods. At a low level it looks fairly similar to other optimization methods, but the structure of the network and how it abstracts is very important. It makes much less sense than a naive perspective suggests.

    • @thomasgandalf4111 8 years ago +1

      +Vulcapyro something that runs to iteratively tweak parameters of mathematical formulae until it finds the best possible solution, i.e. explores or exhausts the state space, is pretty much the definition of brute force.... granted CNN don't usually exhaust the state space but make informed decisions on which parameter set to try next
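
The "finding the tallest mountain while blind" picture from this thread can be made concrete. The sketch below is an illustration only: `height` is a made-up one-dimensional landscape with a small hill and a taller one, and `climb` simply follows the local slope uphill. Neural networks are normally trained by following gradients computed with backpropagation (downhill on a loss rather than uphill), not by trying tweaks and keeping the better one, but the dependence on the starting point is the same idea.

```python
import math


def height(x: float) -> float:
    """Hypothetical landscape: a small hill near x = -1, a taller one near x = 2."""
    return math.exp(-(x + 1.0) ** 2) + 2.0 * math.exp(-(x - 2.0) ** 2)


def slope(x: float, eps: float = 1e-5) -> float:
    """Numerical estimate of the local slope (central difference)."""
    return (height(x + eps) - height(x - eps)) / (2.0 * eps)


def climb(x: float, step: float = 0.1, iterations: int = 2000) -> float:
    """Walk uphill using only local slope information."""
    for _ in range(iterations):
        x += step * slope(x)
    return x


if __name__ == "__main__":
    for start in (-1.5, 1.0):
        top = climb(start)
        print(f"start={start:+.1f} -> x={top:+.3f}, height={height(top):.3f}")
```

Starting to the left of the valley ends on the small hill, while starting to the right of it reaches the taller one: following the slope is far cheaper than exhausting the whole space, but it only guarantees a local optimum.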

  • @michaelsidorov157 2 years ago

    At 9:07: the size of the image changes not because only the middle pixel is computed, but because the kernel fits into the image fewer times than the image's width and height.
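
The size change described here comes from counting how many positions a kernel can occupy without hanging over the edge of the image. A minimal sketch, assuming no padding (often called a "valid" convolution); the function names are illustrative and not taken from the video:

```python
def valid_output_size(input_size: int, kernel_size: int, stride: int = 1) -> int:
    """Number of positions where the kernel fits entirely inside the input."""
    if kernel_size > input_size:
        return 0
    return (input_size - kernel_size) // stride + 1


def count_positions(input_size: int, kernel_size: int, stride: int = 1) -> int:
    """The same quantity, counted by sliding the kernel explicitly."""
    count = 0
    start = 0
    while start + kernel_size <= input_size:
        count += 1
        start += stride
    return count


if __name__ == "__main__":
    for n, k in [(28, 3), (28, 5), (224, 7)]:
        print(n, k, valid_output_size(n, k), count_positions(n, k))
```

For example, a 3-pixel-wide kernel fits into a 28-pixel-wide row at 26 positions, so each unpadded 3x3 convolution trims one pixel from every border.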

  • @Locut0s 8 years ago +1

    Wow exciting. Fantastic explanation and I can really see the power of this! Can't wait to see where deep learning and computer AI in general goes in the years to come. We are on the edge of some very exciting stuff.

    • @RedNNet 8 years ago

      Personally, I think hierarchical temporal memory will have a larger impact in the long run because it uses the brain as a constraint and won't have to rely so much on human ingenuity.

  • @lewisb8634 8 years ago +30

    This guy seems cool - I like the videos he presents! :)

  • @davidm.johnston8994 6 years ago +2

    Thank you so much! Aside from entertaining me for years now, this video has actually helped me in my personal little research in programming an AI in a simple game using TensorFlow. (Is it overkill? Sure. Is it fun to do and learn? Heck yeah!)

  • @TrabberShir 8 years ago +19

    "I'd have to start by programming up linux" he says while sitting in front of a WPF book

    • @DarkmoonUK 8 years ago +12

      ...because Computer Scientists are only allowed to reference one Operating System? I don't get it.

    • @amirabudubai2279 8 years ago +4

      Linux is better for long (multi-day) computations because it is more stable and uses fewer resources in the background; it also happens to make the project more reproducible, because versions of Linux don't become dysfunctional with time like Windows. Even if a CS has Windows on their personal machine, there is no reason to think they wouldn't use Linux on their workstation.

    • @baconology 7 years ago

      agreed he is not capable of this but he doesn't care because he has WORK TO DO.

  • @omegasrevenge 8 years ago +35

    I would have loved to hear examples of where these are getting used and what kind of impact they have on our way of life!

    • @Kram1032 8 years ago +17

      +abschussrampe Google is currently pushing them onto basically everything. I'm not sure that _all_ their services use them yet but increasingly they do.
      For the following they either have talked before about _planning_ to use this technology or they already use it directly in the services you may or may not love to use by them:
      Google Maps
      Google Car driving
      Google Now suggestions
      Google Search
      YouTube Thumbnails
      YouTube video suggestions
      Google Translate
      Google Photos
      DeepMind (the guys behind AlphaGo)
      Allo, their new messaging app
      probably lots more
      Other big guns who either talk about or already do use these:
      Apple
      Microsoft
      Facebook
      Amazon
      probably lots more
      Nowadays, if you are on the internet, chances are you are using a service that in one form or another relies on deep learning and convolutional networks. You can do an insane number of semi-cognitive tasks with them. They _do_ have their limits in their current form but development goes rapidly.

    • @IceMetalPunk 8 years ago +3

      +abschussrampe They're used in quite a few places. One example I saw used a CNN to classify the activities occurring in a video--for example, to learn how to tell whether a video is of someone hiking, mountain biking, swimming, or canoeing. It could be expanded with more data sets to classify other types of activity, which in turn would allow our future AIs to understand what's happening around them instead of having to be told what's going on before knowing how to react.

    • @Vulcapyro 8 years ago +1

      +abschussrampe Self-driving cars (Autos, you might say) are essentially all implemented as systems of CNNs at this point, if you want a particular example that will likely significantly change our way of life.

    • @mrkwse4415 8 years ago +2

      If you're outside the EU/Canada, Facebook uses CNN for facial recognition to tag photos

    • @thallium200 8 years ago +4

      Rest assured the government is using them to identify not only who you are but what you're doing. We're going from facial recognition to activity recognition.

  • @nand3kudasai 8 years ago +1

    I love the topics from this guy and he explained it very well. Though his accent is a little difficult for me. Awesome video and very cool how you get to reference and link all those other previous videos

    • @Rajeshgandh 2 years ago

      Me too, I enabled the subtitles. Anyway, it's a great video

  • @anwul4 8 years ago

    Now I know what to work on between handing in my Master thesis and defending it. Cause model my Artificial Neural network towards a Convolutional Neural Network. Might indeed bring up my accuracies in regards to recognizing game events based on Electroencephalogram and Eye-tracking data.

  • @yesim18duh14 8 years ago +1

    This was super awesome, I love this Mike guy!

  • @tubeyoukonto 6 years ago

    Have to do a work on a paper about imagenet and deep convolutional neural networks. This video explained sooo much! Thank you!!!

  • @Lougehrig10 7 years ago +1

    So really, Machine learning is creating an automated task to find enough differences that are unique to a specific thing so that you can then assume an outcome with enough confidence

  • @gabetower 8 years ago

    I love Mike's videos on image processing.. Keep em up!

  • @aligator381 8 years ago

    Another great video!
    I really like the fact that you create annotations for relevant or prerequisite videos and stuff, but maybe they would be more useful if they opened in a new tab. I don't want to lose where I was on this video, when I open and go through the annotated video.

  • @titouchose6534 7 years ago

    knowing a little bit about how the human visual system works, it seems like you're actually describing it...
    And that's scary and thrilling at the same time.

  • @MistThief 7 years ago +164

    I am a neural network watching videos about neural networks.

    • @musicjetstream2476 7 years ago +33

      we need to go deeper

    • @timconnors3386 6 years ago

      this is amazing

    • @dzlcrd9519 6 years ago

      Moeシt wow

    • @kaidatong1704 6 years ago +7

      aren't we all?

    • @ianallen738 4 years ago +2

      I am a neural network inputting and outputting comments about neural networks watching videos about neural networks. The singularity is nigh.

  • @tryfonmichalopoulos5656 3 years ago +3

    4:00 - That is a somewhat misleading statement right there, my friend! The reason why convolution is used is not that the input space needs to be downsampled so that the computer will not melt; the reason is that if you don't use the kernel-based method typically known as convolution, you lose ANY topological information that relates the pixels to each other. There is literally no chance that a multilayer-perceptron type of neural network could achieve accuracy anywhere near a convolutional neural network, and the reason is, as stated previously, that the topological relationship of the pixels is lost once the image is reshaped into a one-dimensional vector whose length equals the total number of pixels. In fact, the latest architectures do not even attempt to downsample the information, and pooling techniques are considered to be obsolete, further proving how wrong the statement you made at 4:00 is.
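
One way to see the point about lost pixel relationships is to shuffle the pixels. The sketch below is a toy under stated assumptions, not anything from the video: `PIXELS`, `WEIGHTS` and `PERM` are made-up values, `dense` stands in for a single fully connected neuron, and `conv1d_valid` is a one-dimensional sliding-window sum (no kernel flipping, as in most deep learning libraries). A dense neuron gives the same answer when pixels and weights are reordered together, so it has no built-in notion of which pixels are neighbours; the kernel's response depends on adjacency and changes.

```python
def dense(pixels, weights):
    """One fully connected neuron (no bias, no activation)."""
    return sum(p * w for p, w in zip(pixels, weights))


def conv1d_valid(signal, kernel):
    """Slide the kernel along the signal without padding."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


PIXELS = [0, 0, 0, 1, 1, 1, 0, 0]        # one bright segment in a dark row
WEIGHTS = [0.5, -1.0, 0.25, 2.0, -0.5, 1.5, 0.0, -2.0]  # hypothetical weights
PERM = [3, 0, 6, 1, 4, 7, 2, 5]          # an arbitrary fixed reordering
EDGE_KERNEL = [-1, 1]                    # responds to changes between neighbours

shuffled_pixels = [PIXELS[i] for i in PERM]
shuffled_weights = [WEIGHTS[i] for i in PERM]

if __name__ == "__main__":
    print(dense(PIXELS, WEIGHTS), dense(shuffled_pixels, shuffled_weights))
    print(conv1d_valid(PIXELS, EDGE_KERNEL))
    print(conv1d_valid(shuffled_pixels, EDGE_KERNEL))
```

On the original row the edge kernel fires only at the two borders of the bright segment; on the shuffled row it fires in different places, even though the dense neuron cannot tell the two arrangements apart.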

  • @JacksMacintosh 8 years ago +4

    This guy is awesome

  • @siotsoni9854 8 years ago +17

    "... and there'll be a different representation of my face transformed in some way to be useful." BRILLIANT

  • @szeredaiakos 5 years ago

    I like how he throws around "corners and edges". At the beginning of DL, corners and edges were actually a prediction, but in reality the slices of the most capable nets look absolutely nothing like corners and edges and a whole lot more like noise.

  • @jeffg4686 2 years ago

    amazing explanations here. Thanks for the share. This has made things more digestible.

  • @aurinator 3 years ago +1

    Absolutely fascinating! Great stuff, thanks for making it & sharing. I'm suspecting convolutional neural networks (CNNs) are possibly the solution for any potentially subjective classification, like the images in your video, but now wondering if a CNN is eventually equal to a Support Vector Machine for OBJECTIVE classifications, e.g. solutions to a mathematics problem.

  • @ZimoNitrome 8 years ago +351

    3 deep 5 me learning

  • @jaffarbh 2 years ago +1

    Interestingly, there is a platform out there called Accuval and they claim their house valuation is by far the best because they use ANN (fully connected rather than CNN).

  • @31337flamer 8 years ago

    He's the best Computerphile prof.. deep and on point.. would love to work with him :O

  • @andreylebedenko1260 4 years ago

    Looks like the next step will be combination of CNN with temporal one. Like this: keep on moving CN kernel over the image using this path (as learned before) and keep on feeding TN with data, until we have 99% detection certainty.

  • @Omiiee 6 years ago

    The James Acaster vibes are so strong in this guy. Perhaps all this revising has turned my brain to mush, but this video really helped :D Thank you!

  • @stef-ruvx 6 years ago +1

    The dog sound effect gave me a bit of a chuckle

  • @TurtleTyrant 7 years ago +2

    Best nap ever

  • @VicFreg19 8 years ago +1

    How would you handle different sized images in the training data set? According to what I understood, the number of neurons (and weights) depends on the number of pixels.

  • @cmdody 7 years ago +3

    We want video about PNN(Probabilistic Neural Network)

  • @theforester_ 2 years ago

    what a class! big shout out from Brazil

  • @klam77 8 years ago

    A beautiful description! Well done.

  • @КаринаМашанова 2 years ago

    Very cool video. I understood and saw clearly! Thank you so much for such content!!

  • @RBYW1234 4 years ago

    So all the options you have for choosing what kind of new car you want to buy makes up a neural network.
    Right input and you get a possible car choice.
    How do you make it into a deep learning by-product.
    Ive seen curiosity used to train.
    This stuff is scary fun.

  • @iagocasabiellgonzalez7807 8 years ago +1

    Extraordinary explanation, thanks!

  • @jonysonic3595 3 years ago

    when you learning basic for a project and you find it's your supervisor in this video XD

  • @johnappleseed8839 7 years ago

    Expected a video about neural networks analyzing sentiment to help news outlets adjust their narrative :o

  • @flits1 4 months ago +4

    update: they are indeed a big deal

  • @galan8115 3 years ago

    Man, i would really like to see the work behind the root tips convolutional network.

  • @alexrossouw7702 8 years ago

    You should use growing media-free hydroponic systems for viewing healthy roots. I'm sure you are aware of the "speaking plant approach" or perhaps the benefit of using imaging and CNNs for monitoring crops, as plants don't talk binary...

  • @tho207 8 years ago

    wow that was delightfully well explained, I enjoyed the video so much.
    please ask him to talk about RNNs too!

  • @Carofdoom1126 3 years ago +1

    '"someone" came around and applied this to imagenet and got great results'
    Well those someone's won the 2018 Turing award for that work LOL (LeCun for that work in particular. Bengio and Hinton for similar work)

  • @jerrodmilton5776 8 years ago +7

    Is this the process that Google used for its image recognition software that can be run backwards to "dream up" images of the things it can recognize? So if the program can recognize an image of a cat, you can reverse it and have it generate a picture of what it thinks a cat looks like.

    • @karlkastor 8 years ago +4

      +Jerrod Milton You are correct! How they basically do this is instead of adjusting the weights of the network to get the correct output value, they change the pixels of the input image, so that the CNN predicts e.g. a cat with higher certainty.

    • @black_platypus 8 years ago

      +Karl Kastor Wait, this exists? Are there front-ends to those applications available?

    • @karlkastor 8 years ago +5

      Benjamin Philipp Google Deep Dream. People have done several implementations for this since the original paper.

    • @karlkastor 8 years ago

      Benjamin Philipp
      google Deep Dream. People have done several implementations for this since the original paper

    • @black_platypus 8 years ago

      Karl Kastor
      Thanks - I've since found Deep Dream, but thanks for coming back for me :)
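
The idea described in this thread, keeping the network fixed and changing the input pixels so that the prediction becomes more confident, can be sketched with a toy model. Everything below is an assumption for illustration: a single logistic unit (`cat_score`) stands in for a trained CNN, `WEIGHTS` and `BIAS` are made-up "trained" values, and the update is plain gradient ascent on the pixels. The real Deep Dream code works on a deep network and has more machinery than this.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def cat_score(pixels, weights, bias):
    """Toy 'is it a cat?' score from a single logistic unit (a stand-in for a CNN)."""
    return sigmoid(sum(p * w for p, w in zip(pixels, weights)) + bias)


def dream(pixels, weights, bias, step=0.5, iterations=50):
    """Gradient ascent on the input pixels; weights and bias stay frozen."""
    pixels = list(pixels)
    for _ in range(iterations):
        s = cat_score(pixels, weights, bias)
        # For a logistic unit: d(score)/d(pixel_i) = s * (1 - s) * weight_i
        pixels = [min(1.0, max(0.0, p + step * s * (1.0 - s) * w))
                  for p, w in zip(pixels, weights)]
    return pixels


WEIGHTS = [1.0, -2.0, 0.5, 1.5]   # hypothetical, already-trained weights
BIAS = -1.0
START = [0.5, 0.5, 0.5, 0.5]      # a flat grey "image" of four pixels

if __name__ == "__main__":
    dreamed = dream(START, WEIGHTS, BIAS)
    print(cat_score(START, WEIGHTS, BIAS), cat_score(dreamed, WEIGHTS, BIAS))
```

Training would adjust `WEIGHTS` to fit labelled images; here the roles are swapped, so the image drifts towards whatever the fixed model scores highly.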

  • @DeJayHank 8 years ago +1

    Very interesting stuff. Been working a lot with computer vision and at least tried a bit of ANN, so to see them combined like this is intriguing.
    I'm currently working with Fringe Projection Profilometry and I think that could be quite cool to show on this channel too.
    Basically it's a way to get a depth map of an object by projecting a sinusoidal fringe image on it, and taking 3-4 photographs where the fringe function is in different phases. Then you combine these images with mathematics!

  • @PicturesqueGames 8 years ago

    Classic neural net model doesn't work anyway. when you remove input from the node it still produces echoes of that input - that's how biological neurons work.
    How can you apply this to image processing? returning echo signals are learned via neural link storages for better results which system deems more favorable. Basically that means that your library can store node assigned info and each node instead of doing full processing theoretically can pull out saved neural link for just detecting a small portion of familiar input, run checks on that one and high-tune it to a degree when you'll need, say only 1/4th of sobel convolution etc.

  • @johnthegod 8 years ago +2

    This sounds very interesting, it gave me a nice flashback to my AI studies a few years ago.
    How would this method hold up against noisy images or partially occluded images once the network is trained? For example if you trained a CNN to recognise your face from n images, could you wear a phantom of the opera mask and still expect it to recognise you?

  • @SIC66SIC66 8 years ago +1

    I LOVE these technical videos. It would be great to see such videos on hardware too.

  • @GoodWoIf 8 years ago

    Presumably images is just one type of information you could feed into one of these. You could feed in text, or patient vitals/symptoms, economic data, etc..

    • @IceMetalPunk 8 years ago

      +GoodWoIf Yep. Mike tends to focus on image analysis in his videos because he is, in fact, an image analyst, but CNNs can be used on pretty much any data set you can think of--as long as you have enough data to train them and a decent number of possible convolutions to apply.

  • @ysantamorena5150 8 years ago +1

    thank you for this video as well!

  • @DirkArnez 11 months ago

    "Pro WPF in C# 2010" is what i got without problem from the video

  • @MaxIme555 7 years ago

    Great and comprehensive video!

  • @leo333333able 8 years ago

    good explanation

  • @peschebichsu 3 years ago

    Would be nice to see you talk about RNNs

  • @casperTheBird 5 years ago

    He said at the beginning that feeding in a whole picture of pixels would be too much data and too many pixels, but it's hard for me to see why this method would be much better

  • @syawkcab 8 years ago +4

    What's the first sentence he says? "This is kind of a full opt vice's videos on deep learning?" I replayed it like 30 times and I can't figure out what he's saying

    • @Computerphile 8 years ago +11

      +syawkcab "This is kind-of a follow up to Brais' video on deep learning"

    • @syawkcab 8 years ago +3

      OHHHH!

  • @clays6359 5 years ago +1

    Please do a video on how CNN's are applied to Natural Language Processing (NLP). Usually RNN's are, but CNN's can also be used.

  • @Gabagool22 4 years ago

    What an amazing explanation!

  • @mibo747
    @mibo747 2 years ago

    Many thanks for the lecture

  • @judgeomega
    @judgeomega 8 years ago +1

    My intuition says that using an HSI image format would have much better results than RGB, as shadows would be simplified.

    • @evelynfinegan4687
      @evelynfinegan4687 8 years ago

      RGB was just the example used here, I don't think he literally meant they use RGB to produce different versions of the image.

    • @Wizarth
      @Wizarth 8 years ago +3

      Heck, three of the convolutions might well convert from RGB to H, to S, to I, and/or some combination of them.

    • @dezmoanded
      @dezmoanded 8 years ago

      Right, the information is exactly the same, so it might not matter to the net.

    • @RedNNet
      @RedNNet 8 years ago +1

      The type of input matters because it helps when the input correlates strongly with reality. You could map every possible input to a pre-determined random representation, and it would probably have a lot more trouble, especially with generalization. It helps if similar inputs correspond to similar outputs.
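
      To make Wizarth's suggestion in this thread concrete: the intensity channel of HSI is the plain average of R, G and B, so a single 1x1 convolution with weights of one third each reproduces it exactly. Hue and saturation are nonlinear functions of RGB, so one linear filter cannot compute them; a network would need several layers with nonlinearities to approximate them. The sketch below covers only the intensity case, and the tiny `image` is made up for the example.

```python
def conv1x1(image, weights):
    """Apply a 1x1 convolution: each output pixel is a weighted sum
    of the channel values at that same pixel."""
    return [
        [sum(w * c for w, c in zip(weights, pixel)) for pixel in row]
        for row in image
    ]


# Hypothetical 2x2 RGB image, each pixel given as (R, G, B).
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(30, 60, 90), (255, 255, 255)],
]

# HSI intensity is I = (R + G + B) / 3, i.e. equal weights on all channels.
intensity_weights = (1 / 3, 1 / 3, 1 / 3)

intensity = conv1x1(image, intensity_weights)
print(intensity)
```

      Pure red and pure green end up with the same intensity here, which shows what the intensity channel discards and why the choice of input representation can still matter in practice.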

  • @philippetrov4881
    @philippetrov4881 7 years ago

    If the process is viewed as a hash algorithm, then a collision is what we are looking for at the end :)

  • @JD_Mortal
    @JD_Mortal 1 year ago

    I'm actually trying to do this in reverse... A form of one-shot pass to detect "everything detectable", within an image. Mentally, the impossibility of "this system", is that I would have to shrink the whole universe down into a single pixel. Extrapolating it out, starting at the pixel-level and working up to some fixed detail of "absolution".
    That is like me giving you $200,000 and then asking you... "Tell me everything I can buy with this." Or giving you "the web", and asking you "how much did it cost to buy everything found online"... In one pass.
    Where there is a will, there is a way. I think I found a way... To do this, to a practical extent. A lot less training required and greater potential for "more info in a smaller space". Totally ignoring the fact that it detects more than a few "demanded" things at one time. Sometimes people just don't know what they are looking for, until you show it to them. Why examine the same picture ten times to figure out that it has a dog in it, because "dog" was the last thing you asked it to detect in your last set, because you ran out of neural-net detection ability after just four items in a detection.
    You know what's funny... They don't need a resolution of "0 and 1" for an output to be "sure" it's possibly a dog. If they actually evaluated the neural-net, after weighting... they would realize some processing contributed NOTHING or nothing significant to the detection process and that could have been eliminated and more time spent processing something that contributed MORE to the weighting. Going all the way down to those last pixels, which also had many things that contributed nearly nothing to the output. The detection could have jumped-out ten levels earlier if it only used "processing relevant to dogs", when detecting dogs. That thumb-print, alone, is worth its weight (pun intended) in gold.
    Honestly, the pixels aren't important, it's the "detectable attributes" and the arrangement of them, which is important. The specifics of details are only important to specific things. Not when doing "general detections", like "dog" or "person". Data that already exists as "Labradors" and "golden retrievers" and "mutts", with specifics that have weights already learned. Weights that can be normalized to represent "dog", without ever having to train one actual "generic dog" as a "dog".

  • @DiegoAndrade
    @DiegoAndrade 7 years ago

    Thank you for such a great explanation!

  • @lddevo88
    @lddevo88 8 years ago

    I think for future videos you should set up the camera on a tripod and speak directly to the camera, to us. But otherwise this is very well explained and demonstrated!

  • @kp-ce1uk
    @kp-ce1uk 8 years ago

    Surprisingly simple.

  • @Joe_vanni
    @Joe_vanni 1 year ago

    This guy is a monster! He explains so well.

  • @Phagocytosis
    @Phagocytosis 8 years ago

    Right, so you can adjust the weights with back propagation. I'm not sure if I understand how that works in detail, but I can imagine the principle. However, what concerns me is, how do you/how does the algorithm determine which combinations to make in all the intermediate layers? Mike was talking about how it might be edges, might be corners, might be brightness in the middle, we won't know, so we let the algorithm work it out, but how does it work this out? Because it sounds (from 9:55) like this whole thing has to be set up before we can test/train it to begin with.

    • @Phagocytosis
      @Phagocytosis 8 years ago

      Matthew Taylor Right! I think I understand, and that makes sense. Thank you for the explanation.
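
      On the question at the top of this thread: the layout (how many layers, how many filters, what size) is indeed fixed before training, but the numbers inside each filter start out random and are what training adjusts. The sketch below is a single-layer stand-in for backpropagation, using plain gradient descent on one 1D filter. The training data, the `hidden_kernel`, and the learning rate are all assumptions chosen for the example.

```python
import random


def conv1d(signal, kernel):
    """Slide the kernel along the signal (valid mode, no padding)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def train_kernel(signals, targets, kernel_size=3, lr=0.1, epochs=200, seed=0):
    """Fit kernel weights by gradient descent on mean squared error."""
    rng = random.Random(seed)
    # Start from small random weights: nothing about edges is built in.
    kernel = [rng.uniform(-0.1, 0.1) for _ in range(kernel_size)]
    for _ in range(epochs):
        for signal, target in zip(signals, targets):
            output = conv1d(signal, kernel)
            grads = [0.0] * kernel_size
            for i, (out, want) in enumerate(zip(output, target)):
                error = out - want
                for j in range(kernel_size):
                    # d(error^2)/d(kernel[j]) = 2 * error * signal[i + j]
                    grads[j] += 2 * error * signal[i + j] / len(output)
            # Step each weight against its gradient.
            kernel = [w - lr * g for w, g in zip(kernel, grads)]
    return kernel


# Hypothetical training data: random signals labelled by a hidden filter.
data_rng = random.Random(1)
hidden_kernel = [-1.0, 0.0, 1.0]
signals = [[data_rng.uniform(-1, 1) for _ in range(20)] for _ in range(20)]
targets = [conv1d(s, hidden_kernel) for s in signals]

learned = train_kernel(signals, targets)
print([round(w, 3) for w in learned])
```

      The learned weights end up close to the hidden edge-style filter even though the code never mentions edges. In a deep network the same idea applies to every filter in every layer, with the error signal passed backwards through the layers.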

  • @bladeqmaster
    @bladeqmaster 8 years ago

    That "Pro WPF in C# 2010" book at the back. :)

  • @Bozemoto
    @Bozemoto 4 years ago

    So effectively the filter is a single neuron hooked up to an X by X section of the previous layer, replicated across the entire previous layer. Like a mass produced chip.
    So if you have 10 convolutions in each layer and have 4 layers doesn't that mean you end up with 10000 images/features by the end?
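
    On the counting question: in the standard formulation used by common CNN libraries, each filter in a layer spans all the feature maps coming out of the previous layer and sums over them, so 10 filters give 10 output maps regardless of how many maps came in. The count therefore stays at 10 per layer and does not multiply up to 10,000. The sketch below works through that bookkeeping for a hypothetical network with 3x3 filters and an RGB input.

```python
def feature_map_counts(filters_per_layer):
    """Number of feature maps leaving each layer.

    Each filter produces exactly one map, whatever the number of
    incoming channels, so the count equals the number of filters.
    """
    return list(filters_per_layer)


def weights_per_layer(in_channels, filters_per_layer, kernel_size):
    """Weight count per layer (biases ignored).

    Each filter holds kernel_size * kernel_size weights for every
    incoming channel, which is where the depth of the input shows up.
    """
    counts = []
    channels = in_channels
    for num_filters in filters_per_layer:
        counts.append(kernel_size * kernel_size * channels * num_filters)
        channels = num_filters  # this layer's maps feed the next layer
    return counts


# Hypothetical network: RGB input, four layers of ten 3x3 filters each.
layers = [10, 10, 10, 10]
maps = feature_map_counts(layers)
weights = weights_per_layer(3, layers, 3)
print(maps, weights)
```

    The growth shows up in the depth of each filter (3x3x10 weights from the second layer on), not in the number of images produced.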

  • @clintbellanger
    @clintbellanger 8 years ago

    So neat. Do these CNN libraries use graphics cards for some calculations? Some steps of this remind me of pixel shaders.

    • @michaelpound9891
      @michaelpound9891 8 years ago +3

      +Clint Bellanger Oh yes, in fact without GPUs it would take far too long to do any of this. Combined with some of the developments in CNNs themselves (e.g. ReLU), GPUs have made all of this possible.

  • @jackdklee1396
    @jackdklee1396 7 years ago

    Are there different ways to implement the library? How do people in competitions make their algorithm better using the same library?