If you are talking about "TLDNEEE," this is not the number predictor. This is a level generator; it has a more complex neural network than the one in this video.
Man, that's a pretty dope illustration of a convolutional neural network, a common deep learning algorithm, in this case for alphanumeric character recognition. Today's job for AI or ML engineers.
My mind was somewhere else when I scrolled to this, and slowly my thoughts faded when I started focusing on the screen and was frozen for like 30 seconds
Our brains are like diamonds. While computers have to perform such precise and complex calculations, we can do it within two seconds just after seeing it. Guys don't hate yourself, you are special and unique, think positively and don't give up on life, we still have a hundred years to experience the vast world!
Yes, it is a classifier. An image is the input and a number between 0 and 9 is the output. Very roughly, something like ChatGPT works the same way: the input is your prompt and the output is the next word (the whole thing is then run over and over to produce the full output).
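To make the comparison above concrete, here is a rough Python sketch. The score list and the `NEXT_WORD` lookup table are made-up stand-ins, not anything from the video or from a real language model; a real model would compute its scores from the whole input rather than look them up.

```python
def argmax(scores):
    """Index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])

# Classifier: one image in, one score per digit out, pick the highest.
digit_scores = [0.01, 0.02, 0.05, 0.85, 0.01, 0.01, 0.02, 0.01, 0.01, 0.01]
predicted_digit = argmax(digit_scores)

# Hypothetical next-word "model": a lookup table standing in for a network.
NEXT_WORD = {
    "the": "network",
    "network": "predicts",
    "predicts": "a",
    "a": "digit",
    "digit": "<end>",
}

def generate(prompt, max_words=10):
    """Run the next-word step over and over, feeding each output back in."""
    words = prompt.split()
    for _ in range(max_words):
        next_word = NEXT_WORD.get(words[-1], "<end>")
        if next_word == "<end>":
            break
        words.append(next_word)
    return " ".join(words)

print(predicted_digit)   # 3
print(generate("the"))   # the network predicts a digit
```

The digit classifier runs once per image, while the text generator repeats the same kind of prediction step until it decides to stop.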
This is a visualization of an image-processing algorithm at work. Each new layer is a new stage of the program's operation. First the region containing the drawing is cropped out, then the image is analyzed pixel by pixel and the large array of image data is reduced to a small one, and finally to a simplified one. In the last stage, the program matches what it ended up with against the images it knows and outputs the result. This all happens quickly; the program's work is visualized on screen to show just how much processing and computation it takes for the program to understand that you entered the number 3.
This shows machine algorithms recognizing a number. First, masks are used to reduce the dimensionality, and then, after the masking, a mathematical vector is formed and compared with reference vectors of digit images. This is how machine recognition algorithms work.
This makes AI look like an active brain that needs to analyse what it is seeing, while ours is just a subconscious brain that looks at things and goes "ooh! That looks like a heart cloud". Imagine having to do a long calculation of everything in your room every few minutes to understand what you are doing 😂 and manual breathing
This AI, or CNN, is highly parallelized, meaning it does a bunch of small computations at the same time. In reality you could run this AI on your phone a couple hundred times a second. Our brain is also parallel and probably works very similarly to a CNN. Both extract features, and from those extracted features they extract more. For example, if I wanted to detect a dog, I'd look for the curves in its face. Then, from the curves I extracted, I'd look at groups of curves that look like a nose, and from that extracted feature I'd say it looks like a dog nose. Both the human brain and a CNN work similarly. Although the human brain seems like magic, it is not, and as far as we know anything that happens in the brain could theoretically be described with logic and therefore by a computer.
A CNN is a type of neural network (AI) that is typically used to categorize images. This is a visualization of how it works: it repeatedly breaks the image down into smaller layers. Each layer becomes smaller because it extracts only the defining information of the image. It keeps doing that until it reaches a final layer where it can easily determine what number the guy drew.
Hi, just to add to this, in case you want to investigate more: in the video you first see the input layer, which coincides with the input image, in this case the 3. After that, the layers that become thinner but longer and grow towards you are called convolution layers. The next group of layers (the wide line) is called a dense layer. And finally, the layer with the numbers is the output layer. Each layer's outputs are generally connected to the inputs of the next layer, and each layer has its own function in the CNN. The convolution and dense layers usually contain trainable values called weights, and their values change during training to better solve the problem.
I know how a CNN works, so I know what each step is doing. However, if someone doesn’t already know how a CNN works, this will make no sense to them whatsoever. That’s not a very good explanation of a CNN, all things considered.
@@FringeSpectre It's a display showcase... They're not expecting you to suddenly become an expert on convolutional neural networks. At best it provides a surface-level explanation to spark the audience's interest.
In your next video, please cover the following: 1. There is a Google Chrome add-on, "AI Classifer by HIVE", that detects whether a photo, image, video, audio clip, or text is real or AI-generated. 2. Please explain how it works. I even took a picture of a TV channel's screensaver, and the add-on recognized that it was an AI-generated image.
The code is available on my GitHub:
github.com/okdalto/conv_visualizer
If you’d like to see more of my work, please visit my Instagram: instagram.com/okdalto
Awesome design! A small touch could be an explanation of the layers as they unfold. I think people would find it more interesting!
Yes could you please explain what's going on here. I have a guess but I'm sure there is more.
Don't be so harsh, this is one of the greatest human advancements in technology. This is the future, one step closer to true tech cities and an AI god that can protect mankind long into the future from AI and alien threats. Compared to all the other shit, after this the rest was all BS. And protect it with your life, because this is like summoning waves of enemies to steal, destroy, or change its code. And they will try to go back in time to now to try to stop the thing from being created, or to alter it. 100%
A Convolutional Neural Network (CNN) processes an image of a number step by step to figure out what the number is. First, the image goes through filters, which are like small windows that slide over the image and perform a simple operation: multiplying and adding numbers. This step creates new data that highlights specific features of the image, like edges or patterns.
Next, a small adjustment called a bias is added to the result, and a function called ReLU (Rectified Linear Unit) is applied. ReLU is simple: it changes any negative numbers to 0 and keeps positive numbers as they are.
After several rounds of this process, the data is flattened into a single line, a step called reshape. The flattened data is then multiplied with pre-learned values using matrix multiplication. Just like before, a bias is added, and another activation function is applied.
In the final step, the network predicts probabilities for each possible number (like 0, 1, 2, etc.). The brighter the result for a number, the more likely the network thinks it’s the correct answer. This way, the CNN can analyze the image and make an accurate prediction.
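The steps described above can be sketched in plain Python. This is only an illustration: the 4x4 image, the 2x2 kernel, the three-class output and all weight values are made up for the example and are not taken from the visualizer's code or its trained model.

```python
import math

def conv2d(image, kernel, bias):
    """Slide the kernel over the image (no padding, stride 1): multiply, add, add bias."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            total = sum(image[r + i][c + j] * kernel[i][j]
                        for i in range(kh) for j in range(kw))
            row.append(total + bias)
        out.append(row)
    return out

def relu(feature_map):
    """Negative values become 0, positive values pass through."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def flatten(feature_map):
    """Reshape the 2D feature map into a single line of numbers."""
    return [v for row in feature_map for v in row]

def dense(vector, weights, biases):
    """Matrix multiplication with learned weights, plus a bias per output."""
    return [sum(w * x for w, x in zip(w_row, vector)) + b
            for w_row, b in zip(weights, biases)]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy input: dark on the left, bright on the right.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
kernel = [[-1, 1],
          [-1, 1]]  # hypothetical filter that responds to dark-to-bright edges

features = relu(conv2d(image, kernel, bias=0.0))
flat = flatten(features)

# Hypothetical "pre-learned" values for three output classes.
weights = [[0.0] * 9,
           [0.5 if i % 3 == 1 else 0.0 for i in range(9)],
           [-0.5] * 9]
biases = [0.0, 0.0, 0.0]

probabilities = softmax(dense(flat, weights, biases))
prediction = probabilities.index(max(probabilities))
print(features)    # [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]
print(prediction)  # 1
```

The filter lights up only in the column where the edge sits, and the class whose weights line up with that column ends up with the highest probability.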
That's really cool
Can it play bad apple tho
"Waaaaaaaas THIS your card?"
I laughed SO hard at this
With extra "aaaaaaaaaaaaaaaa"s 😅
@@ayoalex whats the reference?
😂😂😂
Woke up the cat with my laughter 😂
My brain when the waiter asks me how many scoops of ice cream I want
Best comment so far
With the sounds too, obviously
... Wouldn't happen to be 3, right?
Strakkaiatella , no Straitiaella , no Statatata ... um Chocolate!
I still somehow replied with “Yes”
for anyone wondering, the AI doesn't actually take that long to recognize the number; it's just giving a visualization of the neural network and each layer of the NN
edit: didn't expect this to blow up, my explanation is not very clear, I'd recommend searching up a 3b1b video on how neural networks work if you're interested
So you're saying it slowed down to show us exactly what it does in the background when we input the 3 so we can see it all happen even though it can pretty much do it in .000000003 seconds?
@robertwolfgan well I wouldn't say it's instantaneous but yes it takes like less than a second
@gtALIEN Understood. Thank you, I honestly didn't know what I was watching until I found your comment.
@@robertwolfgan Not necessarily slowed down. The visualisation is just a completely separate thing made to mimic what the neural network does. The actual neural network probably just calculated the stuff and sent the results over to the visualizer.
An awful visual representation.
This is exactly how my brain processes writing my name on a test I didn't study for
Well done with this comment
You dropped this m'Lort. 👑
Mine even makes that noise.
Epic 🔥 😂
As an AI engineer, yes this is similar to your brain process, so you're technically correct.
“That looks like a three. But first, let me cross reference it with these other 50,000 images just to make sure.”
Funny thing is, it's not cross referencing anything
@ is it not looking for patterns to recognize from its training data base, inherently cross referencing with other material?
@@NotJackAlderson No, every other image in its training base just alters its weights and biases. If this is considered cross referencing, then your brain is too.
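A tiny sketch of that point, using a single made-up neuron rather than a full CNN: each training example only nudges the weight and bias, and once training is done the examples can be thrown away. The data, learning rate and function name here are assumptions for illustration.

```python
def train_step(w, b, x, target, lr=0.1):
    """One gradient-descent step for a single linear neuron with squared error."""
    prediction = w * x + b
    error = prediction - target
    # Gradients of 0.5 * error**2 with respect to w and b.
    w -= lr * error * x
    b -= lr * error
    return w, b

w, b = 0.0, 0.0
training_data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # points on y = 2x + 1

for _ in range(2000):
    for x, y in training_data:
        w, b = train_step(w, b, x, y)

del training_data  # inference needs only the learned numbers, not the examples

new_prediction = w * 4.0 + b
print(round(w, 3), round(b, 3), round(new_prediction, 3))
```

After training, `w` and `b` settle close to 2 and 1, and the prediction for a new input is computed from those two numbers alone, with nothing looked up in the training set.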
Correct ! @@infrakazos
It probably trains on 50,000 images and tests on 10,000, lol
Best visual representation of how a neural network works
I thought most neural nets only use one hidden layer and not all these iterations shown? Just the input, hidden, and then the prediction? I know more can be added, but I thought the current trend was towards a wider layer rather than a deeper one due to processing power. Or maybe I'm misunderstanding the visualization.
@@grahamskippy I think it really depends on what kind of model you're developing; in this case it's a CNN. But yeah, I don't think all the progressions shown are layers. I think most of them are iterations of different filters extracting features like edges and corners of the digit within one convolution layer. But there should definitely be more than 1, since images are usually downsampled to a more abstract form for more efficient detection.
@@grahamskippy If you use a complex feature extraction method, it will reduce the need for hidden layers. However, in the video, the image is processed directly pixel by pixel, so hundreds of layers are needed to process it. Maybe you can try checking one of the object detection implementations using the YOLO model; it also has hundreds of layers.
😂
@berryyyydaaarrrrruuuu7772 ahhh, I hadn't realized that it was individually predicting each pixel like that, that's awesome! Where did you learn about machine learning? I've only had one class and it sparked an interest in me, but any more courses aren't really applicable to my degree.
The most difficult part of this isn't the image recognition. It's the visualisation 😆
fr
The original concept would be hard to make. But yeah, after that's figured out the visualisation is nuts!!
Today, not 50 years ago.
@@brandoneubank447 stop
Yeah, I bet the task itself was finished in a few seconds, but visualizing it was the bottleneck
“Look at what it takes them to mimic a fraction of our power!”
It's slowed down to 1/1000 speed to visualize how the AI processes data. In a real-world scenario it would have that answer faster than you can blink.
@@Kenneth91619 I'm pretty sure OP was just being silly
Ai is smarter than you could imagine brother and we're only making it smarter faster than we can keep up😂😂
@@Kenneth91619 I dont have eyelids and my eyes hurt. thanks for bringing it up.
For now.
I feel like that machine just did more math than I do in a whole year.
😂
it did. probably not even including the huge amount of math needed just to run the display.
Dude your phone is doing a hundred billion calculations per second. Like obv 😂
A whole year? That's adorable.
Didn’t know what the hell it was, but watched the whole thing because it was satisfying
It's a visual representation of how a convolutional computer vision AI model breaks down an image until it gets to where it shows its guess for which number it was given. It's really cool because this is actually just a way cooler-looking version of the baby's-first-AI example from TensorFlow. Very cool to look at.
I understand most of what you're saying, but can you give me some like application of this or its uses in a more macro or real world environment? Not being snarky or anything, just genuinely curious.
You know how you can copy text from pictures now? That's what it's doing
I think
But sometimes the text isn't copied correctly, so the most obvious outcome for the computer isn't always the correct outcome
Hmmm, this 3 seems to be made of 3... 🤔 I think it's a 3.
It's easy for your brain, but getting a computer to actually know what a 3 is from a poor drawing of 3 is pretty impressive
@@derfvcderfvc7317 you are right, people don’t understand that machines don’t think
@@derfvcderfvc7317 How similar this is to today's compulsory education. I'm glad I was born before the millennium.
Not at all... @@derfvcderfvc7317
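What you see on the screen is basically the hidden layers (multiple layers of "neurons" with different weights). These assign a weight to each pixel, and depending on the weighted sum the network decides which number it is.
Artwork description
This work is an interactive media piece that renders the working principles of artificial intelligence through artistic expression. It converts AI's complex computations into light and sound so that the audience can understand the process intuitively. By expressing visually and aurally how data is passed through and learned by a neural network, it lets the audience directly experience an AI processing information and responding in real time. As AI-generated images and videos become ever more common, the work shows in an accessible way that the basic principles of AI begin with simple calculations such as the four arithmetic operations. While helping the audience see what sequence of steps actually takes place during AI inference, the work also sparks curiosity about what role AI plays in our daily lives and how it will advance in the future.
Wow, thank you
Good
Thank you
I still don't understand
A deep bow ❤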
Me, going to a party: "Alright, just act cool."
Person: "Hey what's your name?"
My brain:
LOL😂😂😂😂
Relatable
Looks like the neural signals are slow to arrive
Lol 😂😂
Says "3" out loud
"3"
In plain English, there are two parts to machine learning/AI: the learning bit, i.e. learning what lots of numerals look like, and then later on the inference bit, i.e. doing the real guessing job given an input and an existing model built from lots of learning. This video shows the inference process, i.e. breaking down the input into a matrix or grid and lots of layers, and figuring out whether the input has a high probability of being one of the known numerals in the model from previous learning. It's not seeing, it's not guessing, it's just working on probabilities, albeit really fast. And it is magical! What an awesome exhibit!
Now tell me, can we do learning and inference at the same time like humans? Or is that not yet solved? Like a constantly evolving network?
Sooooo, it's just looking at the "3" and matching the card.
Jk. Just wanted to frustrate you.
Softmax
I loved this CNN. Google has developed so many complex algorithms now.
@@HermanWillems
Machine learning and humans cannot do learning and inference at the same time. It just looks like it is happening at the same time as it occurs very rapidly.
In decision-making in humans, we tend to consider our past experiences ("learning") in making a decision ("inference"). That's why they say "we learn from our mistakes", since we usually want to avoid making the same mistakes again.
That's the same thing with Machine Learning. The learning part is called "fitting", and the inference part is called "predicting".
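The "fitting" and "predicting" split can be sketched with a toy classifier. The class below is a hypothetical nearest-centroid model written for this illustration, not the network from the video; the `fit` and `predict` method names follow the convention used by libraries such as scikit-learn.

```python
class NearestCentroid:
    """Toy classifier: fitting stores one mean vector per label."""

    def fit(self, samples, labels):
        sums, counts = {}, {}
        for x, y in zip(samples, labels):
            acc = sums.setdefault(y, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[y] = counts.get(y, 0) + 1
        self.centroids = {y: [v / counts[y] for v in acc]
                          for y, acc in sums.items()}
        return self

    def predict(self, x):
        """Return the label whose stored mean is closest to x."""
        def dist2(center):
            return sum((a - b) ** 2 for a, b in zip(x, center))
        return min(self.centroids, key=lambda y: dist2(self.centroids[y]))

# Learning phase ("fitting"): happens once, on labelled examples.
model = NearestCentroid().fit(
    samples=[[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.8, 1.0]],
    labels=["zero", "zero", "one", "one"],
)

# Inference phase ("predicting"): uses only what was stored during fitting.
prediction = model.predict([0.9, 0.8])
print(prediction)  # one
```

Calling `predict` never changes the model; only another call to `fit` would.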
This is what my brain does whenever my wife asks “what did I tell you to do this morning?!”
Finally a CNN we can all enjoy.
This is funny
HA
a congesting neural network
😂😂😂
😂😂😂
Programmed a basic version of this in a machine learning course, it's honestly extremely interesting. A lot of people who are uneducated or uninformed about AI freak out about it and think it's gonna "go rogue" or "take over", when really all AI is, is a bunch of math and data processing. Granted that math and data processing can be unbelievably complex, but that's what it boils down to. Pattern recognition and turning data input into a data response.
You don't really know what you're talking about, no offense.
@@loguski755 nice bait, but you let it rot so bad even he could smell it before you opened that old can💀
@@loguski755 Regardless, he is right. I am a programmer myself. I specialize in stuff like this, and A.I. simply won’t “go rogue” unless someone programs it to. Even with A.I. learning, it’s all still open to be edited. So let’s say, in the “what if” event that an A.I. manages to go rogue, or in an even more extreme situation where it has learned how to access multiple devices across multiple networks all over the world, we could simply edit that part out and chill it out, then program a failsafe so that “if x, y, z activity is noticed, then partition it off and delete it”, but obviously written in whatever language it was written in, which is more than likely a variant of Python.
Our brain is literally just mathematics and data processing. Neurons either fire neurotransmitters or they don’t, and some compounds that act as allosteric modulators make the receptors react more or less strongly to given neurotransmitters. Bing bop boom, do that billions of times and you have a complex pattern-recognizing and learning machine called a brain. We work on 1s and 0s as well, in a way.
@icedawggg I second this. Cause it doesn't take a genius to know that humans KNOW patterns. It's literally built into our instincts.
"Have a routine. Remember this in school."
Like, the only thing separating machine from human is literally 3 things.
Self-awareness, the ability to think freely, and the ability to not care for ourselves.
This is the best-explained CNN (convolutional neural network)
This is used for image recognition in deep learning 😮❤❤
We studied this in school and it's crazy how much training a model needs just for this
What school did you go to?
@@rvesvewell it does not really matter as you probably see this in every school that has a Compsci department
Yeah it's because the human brain is an infinitely more complex data processing machine than a computer doing a bunch of linear algebra. We're just adapted to quantity and variety, and can acquire and process data on our own without supervision.
But I also assure you that what we do to learn to recognize a "3" isn't really less complex, just the inference is insofar as we just pull 3 out of our memory.
I know how these work but it’s just so satisfying to see them in action!
I may never be able to fully comprehend this stuff. Lucky you
The latest large language models are unintelligible. Old AI was understandable. Now even experts just know there are connections, but not necessarily the function or true reason for everything that's needed or happening
Explain pls
Nobody knows how these work, lol
@@muffinconsumer4431 CNNs like this one aren't a complete mystery like more modern transformer networks. They work by learning what convolution kernels need to be applied from one layer to the next, to extract relevant features and classify them. Exactly what features are deemed relevant is up to the network, but it's still understandable because you can literally see what features each layer is extracting from the source input in visualizations like this one.
Didn't expect this to be this helpful, thank you for providing the code as well 🙏
For those wondering, this is a visualization of a Convolutional Neural Network. It takes the image, performs a convolution, then performs max pooling to lower the number of pixels. Then it flattens it (which is why it becomes one line at the end) and puts it into a neural network. Very cool visualization!
I still have no idea what tf that means
@@user-tr2dh4xx6u fr
@user-tr2dh4xx6u I felt like I did. Then I read your comment and realized I actually didn't.
@@user-tr2dh4xx6u The convolution layers use a number of filters to pick out certain features, and the max pooling reduces the resolution. After flattening, the neural network uses the extracted features to guess the answer.
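For anyone still lost on the max pooling part, here is a minimal Python sketch. The 4x4 grid of numbers is made up, and the 2x2 window size is an assumption (it is the common choice, and other commenters mention 2x2 pooling).

```python
def max_pool2d(feature_map, size=2):
    """Keep only the largest value in each non-overlapping size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [max(feature_map[r * size + i][c * size + j]
             for i in range(size) for j in range(size))
         for c in range(out_w)]
        for r in range(out_h)
    ]

feature_map = [[1, 3, 0, 2],
               [4, 2, 1, 0],
               [0, 1, 5, 6],
               [2, 0, 7, 1]]

pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 2], [2, 7]]
```

A 4x4 grid becomes 2x2: the resolution is halved in each direction, but the strongest response in each region survives.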
@@user-tr2dh4xx6u true, sounds cool and smart, ig
I like how in the middle it turns it to a generic 3 as an internal understanding of the 3ness of the image.
I’ve seen a visualisation of the neural network part like this before, but the visualisation of the convolutional layer is fantastic!
This is really cool, the video is a representation of how the computer is sorting through the data (pixels turned on or off 1,0), the code itself isn’t terribly complex either. The real challenge is getting it accurate more often than not
No, it is not sorting, it is max pooling.
It’s also a neural network. People use rat brain cells and stuff and it makes computers smarter
@@big3584 Wait, you know neural networks don't actually use brain cells, right? It's called a neural network because the structure of it is similar to how brain cells are structured, not because it uses them.
Also, I definitely agree with OP, it is a very cool representation of the process. It needs a voiceover or some kind of explanation though. People who don't already know what it is are probably not going to understand exactly what's going on.
This is a very nice visualization of a convolutional neural network. It’s not the original LeNet, because LeNet has 5×5 feature maps at the end, whereas this one has 4×4 with 2×2 pooling. In fact, the input seems to be 32 pixels, the standard size for CIFAR. After the first bottleneck it went down to 17, then 8, then 4, which means padding is used for the convolutions. It could be a modified version of AlexNet, though.
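The size arithmetic in the comment above can be checked with the usual output-size formula, out = floor((n + 2·padding − kernel) / stride) + 1. A small sketch, with assumptions flagged: the LeNet-style stack uses 5×5 convolutions without padding, while the padded variant (3×3 kernels, padding 1) is a guess for illustration, not something read from the video. It gives 32 → 16 → 8 → 4 and does not try to reproduce the 17 mentioned above.

```python
def conv_out(n, kernel, padding=0, stride=1):
    """Output width of a convolution along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


def pool_out(n, size=2):
    """Output width of non-overlapping pooling along one dimension."""
    return n // size


def trace(n, layers):
    """Return the feature-map width after each layer, starting from n."""
    sizes = [n]
    for kind, kwargs in layers:
        n = conv_out(n, **kwargs) if kind == "conv" else pool_out(n, **kwargs)
        sizes.append(n)
    return sizes


# LeNet-style stack: 5x5 convolutions, no padding, 2x2 pooling.
lenet_like = [
    ("conv", {"kernel": 5}),
    ("pool", {"size": 2}),
    ("conv", {"kernel": 5}),
    ("pool", {"size": 2}),
]

# Hypothetical padded stack: 3x3 convolutions that keep the size, 2x2 pooling.
padded = [
    ("conv", {"kernel": 3, "padding": 1}),
    ("pool", {"size": 2}),
] * 3

print(trace(32, lenet_like))  # ends in 5x5 feature maps
print(trace(32, padded))      # ends in 4x4 feature maps
```

Without padding every 5×5 convolution shaves 4 pixels off each side length, which is why the unpadded stack lands on 5×5 while a size-preserving stack with three poolings lands on 4×4.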
Yes! Exactly what I thought on 1st look, thank you for writing it down
Wut
🤓👆
Seemed obvious
Exactly what i was thinking. Who DIDNT see these exact things.... 🫣@xmrdazo
“LOOK WHAT THEY NEED TO MIMIC A FRACTION OF OUR POWER”
A very, very good representation of the AI's ability and the way it breaks down the given knowledge into known knowledge and can deduce from that what the number is. Also, the beeping reminds me of old MSN wallpapers and made me happy. Great work, dude
I didn't know what I was watching. Glad I watched it all, as I still don't know what I watched
Brother you are NOT alone.
As I understand it, watching it create the conditions for such a number to arise, and referencing its dataset to identify it, is elegant and beautiful
Omg, I suddenly understand how my buddy was trying to explain AI to me. That sums up how he explained that a computer can be taught to recognize shapes, numbers and letters. It's only a few lines of math, but computed many thousands or even millions of times a second.
There were way too many neurons here btw, you would usually need maybe 4 layers for something like this
- The project cost $10,000.
- Wait, but you said recognizing the digits cost $500. Where did the rest of the money go?
- Well...
Well, you know, to the marketers, the designers, and bonuses for management.
An impressively well-visualized view of a neural network's processing, I think
That is indeed convoluted.
This is an awesome representation of a neural network at work.
Every YouTube Short is a race between the progress bar and the video getting to a freaking point.
That is a great visualization of the CNN "hello world" program of recognizing handwritten digits 🙂
Someone made this in geometry dash btw
what?
@L3S4n what?
That's insane.
@blucor what?
If you are talking about "TLDNEEE," this is not the number predictor. This is a level generator, it has a more complex neural network than in this video
Man, that's a pretty dope illustration of a convolutional neural network, a common deep learning algorithm, in this case used for alphanumeric character recognition. It's today's job for AI or ML engineers
YES! Yes, I did write the number 3! Amazing!
Bro flipped us off and NO ONE realised good job yt shorts
Ikr
Watching this with a fever hits different
So this what Internet Explorer used to do?
Guys your code is very cool, I have been doing research on deep learning vision for many years but your animation effect still shocked me
What a perfect piece of art for demonstrating CNN! We can even know the architecture from the video.
They nailed the naming when they used "convolutional" jesus f I'll never get that time back
It is actually really cool since it shows how a neural network actually works
My mind was somewhere else when I scrolled to this, and slowly my thoughts faded when I started focusing on the screen and was frozen for like 30 seconds
Wow finally, a CNN that isn't misinformation, the 3 was a 3.
Fr, mainstream media is soo messed up even AI is better at giving accurate information.
Our brains are like diamonds. While computers have to perform such precise and complex calculations, we can do it within two seconds just after seeing it. Guys don't hate yourself, you are special and unique, think positively and don't give up on life, we still have a hundred years to experience the vast world!
wow that's such a beautiful visualization!
This depicts exactly how a CNN works
This is really well made
Something I really want to try
Draw a dingaling😂
I actually LOL'd because you know that was the 2nd thing they tried when it first came on line
I think I've heard it called TTP before aka time to penis
the most beautiful visualization of a neural network I've ever seen
Is that like a visualisation of how ai works?
Exactly, yeah. It's a convolutional neural network, which is basically one of the "thought patterns" of deep learning AI
Yes, it is a classifier. An image is the input and a number between 0 and 9 is the output. Very roughly, something like ChatGPT works the same way, where the input is your prompt and the output is the next word of the response (the whole thing is then run over and over to produce the whole output).
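To make the "image in, number between 0 and 9 out" part concrete, here is a rough sketch of the very last step of such a classifier: the network produces one raw score per digit, softmax turns the scores into probabilities, and the most probable digit is the answer. The scores below are made up, and softmax is the common choice for this step rather than something confirmed from the video's code.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return the most likely digit and the full probability list."""
    probs = softmax(scores)
    digit = max(range(len(probs)), key=probs.__getitem__)
    return digit, probs


# Made-up raw scores for the digits 0 to 9; index 3 has the highest score.
scores = [0.1, -1.2, 0.3, 4.0, 0.0, 1.1, -0.5, 0.2, 0.9, -2.0]

digit, probs = predict(scores)
print(digit)
print([round(p, 3) for p in probs])
```

In the video, the "brightness" of each output number corresponds to these probabilities.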
Not really. This is not like an artistic presentation of how a specific architecture works.
Did you also come to the comments to find out what the hell was going on there and what the end goal of this long animation was?
I still don't get it. What is this?
This is a visualization of an image-processing algorithm at work. Each new layer is a new stage of the program's work. First the region containing the drawing is cropped out, then the image is analyzed pixel by pixel and its large array of data is simplified into a small one, and ultimately into an even simpler one. In the last stage, the program matches what it got against images it knows and outputs the result.
It happens fast; all of this work is visualized on screen to show just how much processing and computation goes on for the program to understand that you entered the number 3.
@damad91 why are you so stuffy, couldn't you have explained it more simply 😂😂😂
This shows machine algorithms recognizing a number. First they use masks to reduce the dimensionality, then after the masking they form a mathematical vector and compare it with reference vectors of digit images. That's how machine recognition algorithms work.
@ecojer1660 if this number of letters leaves you stupefied, I recommend you start reading. Try starting with Kolobok (a children's folk tale), maybe you'll manage it
This is such a cool visualization of a CNN!
This looks like
AI is like an active brain that needs to analyse what it is seeing, while ours is just a subconscious brain that looks at things and goes "ooh! That looks like a heart cloud". Imagine having to do a long calculation of everything in your room every few minutes to understand what you are doing 😂 and manual breathing
This AI, or CNN, is highly parallelized, meaning it does a bunch of small computations at the same time. In reality you could run this AI on your phone a couple hundred times a second. Our brain is also parallel and probably works very similarly to a CNN. Both extract features, and from those extracted features they extract more. For example, if I wanted to detect a dog, I'd look for the curves in its face. Then, from the curves I extracted, I'd look for groups of curves that look like a nose, and from that extracted feature I'd say it looks like a dog nose. Both the human brain and a CNN work similarly. Although the human brain seems like magic, it is not, and as far as we know anything that happens in the brain could theoretically be described with logic and therefore by a computer.
@viniciusdugue3063 wow 👏
NNs are not AI, nor are they brains or anything like it. It is just statistical analysis.
Can someone explain what is happening?
A CNN is a type of neural network (AI) that is typically used to categorize images. This is a visualization of how it works: it repeatedly breaks the image down into smaller layers. Each layer becomes smaller because it keeps only the defining information of the image. And it keeps doing that until it reaches a final layer where it can easily determine what number the guy drew.
@ oh damn that’s so neat, thanks man!
@Lunar-White Face recognition on your phone, voice recognition, and object detection in cars are all applications of CNNs. They basically deal with any image data.
It is alive
Hi, just to add to this, in case you want to look into it more: in the video you first see the input layer, which matches the input image, in this case the 3. After that, the layers that become thinner but longer and grow towards you are called convolution layers. The next group of layers (the wide line) is called a dense layer. And finally, the layer with the numbers is the output layer. Each layer's outputs are generally connected to the inputs of the next layer, and each layer has its own function in the CNN. The convolution and dense layers usually contain trainable values called weights, and their values change during training to better solve the problem.
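As a rough illustration of the dense layer and its weights mentioned above: each output neuron multiplies every input by its own weight, adds everything up, adds a bias, and then an activation such as ReLU is applied. The numbers below are made up; in a trained network the weights and biases are the values that were learned during training.

```python
def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus bias per output neuron."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Replace negative values with 0, keep positive values unchanged."""
    return [max(0.0, v) for v in values]


# Flattened features coming out of the convolution layers (made-up values).
inputs = [2.0, 0.0, 2.0, 0.0]

# One row of weights per output neuron (hypothetical, not trained values).
weights = [
    [0.5, 0.0, 0.5, 0.0],
    [-1.0, 1.0, -1.0, 1.0],
    [0.0, 0.25, 0.0, 0.25],
]
biases = [0.0, 1.0, -0.5]

raw = dense(inputs, weights, biases)
hidden = relu(raw)
print(raw)
print(hidden)
```

Training adjusts the entries of `weights` and `biases` so that the final layer ends up favoring the correct digit.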
Totally awesome. I thought it was CNN the news channel. 😂🎉🎉🎉
yeah cool animation but what does it do?
You write a number and using a neural network it tells you what number you wrote.
completely useless @HonestyLies
Is this something like an engineering school art exhibition?
squid game language 😂
I think its a museum? Not sure but seems like it from the looks of it
Wow, I really want one of these
@Kamilake squid game language 😂
@anjuscuccos 🐙
This is a good visualization of what's going on in the background.
I know how a CNN works, so I know what each step is doing. However, if someone doesn’t already know how a CNN works, this will make no sense to them whatsoever. That’s not a very good explanation of CNNs, all things considered.
if someone can't understand it, we can call it Art
As someone who knows nothing about this, I was going to comment the same exact thing basically.
That's why they have that sign in front with a paragraph explaining it.
@0x1EGEN one paragraph is not enough to understand everything going on on the screen. Come on bro lol.
@FringeSpectre It's a display showcase... They're not expecting you to suddenly become an expert on convolutional neural networks. At best it provides a surface-level explanation to pique the audience's interest.
This is what was happening behind the scenes with dial up internet
And thats how SAP works under the hood
In real time!
This is a great visualization of how the whole thing works.
This is like those super high tech screens you see at the background of a mid movie.
90’s cyber movie vibes.
I’m expecting Tom cruise to run out screaming “ TECH SUPPORRT!!”
This is giving off Sheldon with the bar code playing cards vibe
And remember this was designed to mimic the human brain, we are amazing
Slow machine. When he drew it, in my head I'd already thought "number 3"
Damn. That's exactly how I think!
Me: **Draws a middle finger**
I thought in 2024 cars would be flying but instead we got this
I feel the seconds of my life whisked away watching that …
Alright, everyone here would have drawn a doodle, right? 😂
Oh yes, this is exactly what I want to see at 3 a.m. before work 😊
“The power of 2 human eyes, ladies and gentlemen”
I've been enjoying your Instagram
It's finally showing up in my algorithm too!!!!!
Thought the person walking in the reflection was behind me 😂
Me when I have to write a 350-word essay on a topic I've never heard of.
350 words is absolutely nothing. Essays are regularly around 5000 words in uni
@L3monsta schools ?
"That's great! What is that?" - Spider-Man
My wife's brain when I ask her what she wants to eat
Finally, now we can read the doctor's prescription
In your next video, please cover the following:
1. For the Google Chrome browser there is an add-on, "AI Classifer by HIVE", that detects whether a photo, image, video, audio clip, or text is real or generated by a neural network
2. Please explain how it works in your next video. I even took a picture of a TV channel's screensaver, and the add-on recognized that the image was generated by a neural network
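This is my brain going through every sound it knows when I didn't catch a word in a sentence, and that word could completely change the overall meaning of what was said to me ))
Oh, I like how it visualizes the neural network's nodes while presenting them like a work of art that's pleasing for people to look at.
Watching this while high was quite an experience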
my eye is glued to the video
Convolution. I used to like doing the digital ones based in bits, quite fun. Analog ones are crazy hard 😅
Charity: "Think about the children!"
Me:
idrk what it just did but it looks cool
"i see you just drew something" *dial up noises*
what makes this more impressive is that this process is slowed down significantly. real AI software does this basically instantly
My brain, when the first question is Your Name
We did that 20 years ago at the university. 3-layer NN.
Shodan🔥🔥 system shock😜
This is what my ketamine trip looked and sounded like