neural network from scratch in C

  • Published: 30 Sep 2024

Comments • 247

  • @cybernerddante
    @cybernerddante 2 years ago +564

    Especially impressive in C versus the "Python for everything" approach

    • @duxoakende
      @duxoakende 2 years ago +64

      Eh, usually when you see Python projects for everything they're using a 3rd-party lib to handle it for them. Which makes sense: Python is a high-level scripting language and not designed to be a heavy lifter, instead letting libs written in lower-level languages handle the load for intense computations while abstracting it in an incredibly useful manner. You could do this same thing in Python without using a 3rd-party library, it's just gonna be way slower in pure Python

    • @duxoakende
      @duxoakende 2 years ago +9

      @@ccriztoff maybe. Honestly me personally I'd just like to learn rust, but go is on the list

    • @duxoakende
      @duxoakende 2 years ago +7

      @Walter Hartwell White Ngl that's one of my favorite activities, writing CPython modules lol. Although Cython is much easier tbh :/

    • @misaalanshori
      @misaalanshori 2 years ago +13

      in most machine learning projects I know, Python is usually just tape to connect libraries which were probably written in C (or other high-performance compiled languages). So as a whole, the Python code is probably very small.

    • @shaekahmed8842
      @shaekahmed8842 2 years ago

      @@duxoakende can I contact you? I wanna discuss "cython"

  • @arjuniyer777
    @arjuniyer777 2 years ago +184

    I was always under the impression that creating a neural network, ESPECIALLY IN C would be something too advanced to even understand. However, I understood everything you mentioned and you put it together in such a concise and straightforward manner. Great video!

  • @m4rt_
    @m4rt_ 2 years ago +26

    Media: "AI is going to take over the world"
    Me: *shows image of a cat to AI*
    AI: "That's a frog"

    • @LightOffArchives
      @LightOffArchives 2 years ago +2

      More like
      This is 99.5% frog and 0.5% police car

    • @m4rt_
      @m4rt_ 2 years ago +1

      @@LightOffArchives AI: "This shadow is a 600 year old human"

    • @m4rt_
      @m4rt_ 2 years ago

      @HtAne
      Me: *shows image of cat*
      AI: "that's a cat"
      Me: *shows the same image, but with one changed pixel*
      AI: "that is not a cat"

  • @phoneaccount6907
    @phoneaccount6907 2 years ago +38

    Very dense story; you condensed a 2-hour lecture into 9 minutes. Impressive!

  • @DenisovichDev
    @DenisovichDev 2 years ago +86

    Hey can you please increase the font size a bit in your future videos, it's a bit small. Love the video!

    • @markkraay
      @markkraay  2 years ago +30

      Definitely! Thanks for the suggestion!

    • @mhmmdshaz98
      @mhmmdshaz98 2 years ago +3

      Well, while we're on that matter, can you please mention the name of that font 😛

    • @aarona3144
      @aarona3144 2 years ago +3

      @@markkraay I just found your video/channel through a recommendation so I don't know how many videos you've made since this one, but get yourself a good mic setup (a screen in front of the mic, for example). It should fix all the pops while you're speaking.
      You've got potential to create a really good channel here, so look into it. Should be a good investment.

  • @Avighna
    @Avighna 2 years ago +18

    Finally, a machine learning tutorial that doesn't use a toy language.

  • @demelengopnik7187
    @demelengopnik7187 2 years ago +13

    read it as "neural network in scratch", was very intrigued

  • @stefanogagliardi4665
    @stefanogagliardi4665 2 years ago +108

    You left me speechless, finally advanced programming content.
    Thanks for sharing and your time! I hope for more high-level videos like this, beyond "how to center a div".
    Explaining ML in this way is rare. I'd love a course done this way, not the usual one explaining how to use "Keras" or "PyTorch"; there are none that explain the concepts and implementation as you do! :)

    • @markkraay
      @markkraay  2 years ago +14

      Thank you so much! I'm glad you enjoyed :) There is definitely more to come!

    • @alexandrubragari1537
      @alexandrubragari1537 2 years ago +8

      You think "how to center a div" is not advanced?

    • @nyzss
      @nyzss 2 years ago +4

      nah dude you wrong for the "how to center a div" comment

    • @0xfeedcafe
      @0xfeedcafe 2 years ago +1

      @@nyzss true, because there are a lot of videos out there about stuff like this video

    • @yochem9294
      @yochem9294 2 years ago +2

      explaining ML this way is not rare. It's literally how it's taught in university. I would always advise learning it ‘from scratch’ if people say to me they want to learn about ANNs :)

  • @jacxta
    @jacxta 1 year ago +11

    C tip: If you are trying to recreate the project on Windows, use calloc() instead of malloc() to create the structs. Especially important for the matrices, so they contain zeroes initially instead of leftover values from memory.
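
    A minimal sketch of the difference (the Matrix struct and function names here are assumptions in the spirit of the video, not its exact code):

```c
#include <stdlib.h>

/* Hypothetical matrix struct, similar in spirit to the video's. */
typedef struct {
    int rows, cols;
    double **entries;
} Matrix;

/* calloc zero-fills every entry; malloc would leave whatever bytes
   happened to be in that memory, which corrupts the first epoch. */
Matrix *matrix_create(int rows, int cols) {
    Matrix *m = calloc(1, sizeof(Matrix));
    m->rows = rows;
    m->cols = cols;
    m->entries = calloc((size_t)rows, sizeof(double *));
    for (int i = 0; i < rows; i++)
        m->entries[i] = calloc((size_t)cols, sizeof(double));
    return m;
}
```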

    • @inqonthat1463
      @inqonthat1463 1 year ago +3

      In my first run-through of the code, after getting it to compile under VC++, I only got a 0.003 success rate. This seems to fix that! Got 0.901. Thanks.

  • @justinmitchell3822
    @justinmitchell3822 2 years ago +83

    Linear algebra nitpick: dot products are an operation on two vectors ( en.wikipedia.org/wiki/Dot_product )
    What's described in the video around 2:15 is matrix multiplication ( en.wikipedia.org/wiki/Matrix_multiplication )
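
    The relationship between the two, as a sketch: each entry of a matrix product is itself a dot product of a row with a column (illustrative code, not the video's):

```c
#include <stddef.h>

/* Dot product: two vectors in, one scalar out. */
double dot(const double *a, const double *b, size_t n) {
    double s = 0.0;
    for (size_t k = 0; k < n; k++)
        s += a[k] * b[k];
    return s;
}

/* Matrix multiplication (m x n times n x p): entry C[i][j] is the
   dot product of row i of A with column j of B. */
void matmul(size_t m, size_t n, size_t p,
            const double A[m][n], const double B[n][p], double C[m][p]) {
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < p; j++) {
            C[i][j] = 0.0;
            for (size_t k = 0; k < n; k++)
                C[i][j] += A[i][k] * B[k][j];
        }
}
```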

    • @Davi-c4q
      @Davi-c4q 2 years ago +11

      to be honest I think depending on the context dot is used for matrix multiplication. It's the kind of convention that changes through time. Even numpy does that

    • @gtgunar
      @gtgunar 2 years ago

      they are (and also matrix-vector/vector-matrix) essentially the same operation; in APL, the inner product (a higher-order function) of sum and multiply gives you the needed function. Essentially, it's the vector-vector dot product, but if you use rank polymorphism, it works out the same.
      Rank polymorphism is automated size matching, when you have a scalar, and a vector, and a valid function applied, so for example: 1+2 3 4=3 4 5.
      For matrices it's a bit more complicated but works out just as well.
      Essentially, a matrix is a column vector of row vectors, in APL, and you pair up a column vector of row vectors, with a bunch of row vectors. The shape and order is kept.

    • @jackgao5681
      @jackgao5681 2 years ago

      they're the same thing!

    • @philperry6564
      @philperry6564 2 years ago +2

      @@jackgao5681 The result of a dot product is a scalar, while the result of matrix multiplication is a matrix.

    • @DreamzAnimation
      @DreamzAnimation 2 years ago +1

      @@philperry6564 A scalar can be thought of as a 1 by 1 matrix. If you have two n-dimensional vectors, you can define the dot product as the single element returned from the vector multiplied by the other's transpose. In this sense, the inner product is an example of matrix multiplication, resulting in a 1 by 1 matrix result.

  • @recarsion
    @recarsion 2 years ago +64

    This gives me so much nostalgia for when I tried to do literally the same thing but in C++. In the end I gave up because something was not quite right mathematically; it always ended up stuck after a few iterations of learning and could make no further progress. I thought I understood the math but I could never figure out where I went wrong. This gives me a lot of inspiration to come back to that project, or possibly rewrite it, since in the meantime I realized I don't like C++.

    • @tylerruiz3476
      @tylerruiz3476 2 years ago +4

      Sounds like you might have had a problem with vanishing gradients or lack of precision. Did you normalize the inputs?

    • @nsfeliz7825
      @nsfeliz7825 2 years ago

      i hate c++

    • @recarsion
      @recarsion 2 years ago

      @@tylerruiz3476 Yes my inputs were normalized. It might be worth a shot to try without. I thought vanishing gradient was only supposed to be a problem in deep networks and I only had 1 hidden layer, but it may indeed be the problem.

    • @recarsion
      @recarsion 2 years ago +7

      So a bit of an update: I've just sat down and fixed the whole thing in 3 hours. It wasn't vanishing gradients or floating-point limitations or any of that. My math was a bit off, plus the code itself is kinda terrible, so I didn't spot some basic mistakes. Turns out I've gotten much better at programming in the few years since I abandoned this project.
      My network now gets 90.6% accuracy with 784-100-10 layers, sigmoid activation, mean squared error. The only difference to the network in the vid is that mine also has bias vectors.

    • @ithaca2076
      @ithaca2076 1 year ago

      @@recarsion hell yea that's awesome

  • @illanes00
    @illanes00 2 years ago +14

    I'm asking myself why this video doesn't have a million views yet. Amazing job!!

  • @jbeltz5347
    @jbeltz5347 2 years ago +2

    neural network from scratch in assembly?

  • @12crenshaw
    @12crenshaw 2 years ago +4

    This dude just casually wrote machine learning neural network from scratch in C

  • @imbesrs
    @imbesrs 2 years ago +15

    Hey man, could you do a video on neural nets for dummies like myself? You explain things pretty well, but I got kind of lost after ~4 mins in the video. Would be a nice prequel to this one. Specifically neural nets in general, and exactly what we were doing with the images themselves after getting the dot product of their matrices. Again, you're great at explaining things! Just subbed.

    • @bonybuinta1630
      @bonybuinta1630 2 years ago

      Take the kaggle course

    • @imbesrs
      @imbesrs 2 years ago +1

      @@bonybuinta1630 What's the specific name so I take the right one?

    • @KALEB32154
      @KALEB32154 2 years ago

      ruclips.net/p/PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi
      Great explanation from 3Blue1Brown. Seems to line up with this video's steps pretty well.

  • @SeaUsername
    @SeaUsername 11 months ago +1

    why do we need music in the background, why?????????

  • @baldeaguirre
    @baldeaguirre 2 years ago +1

    why did you use the sigmoid as an activation function instead of ReLU?

  • @austinwoodall5423
    @austinwoodall5423 2 years ago +3

    Consider this: before the industrial revolution, there were actually more jobs, but a great many of those jobs were performed by horses. This time we're the horses. The necessity for human reproduction is generated by the propagation of memes throughout our population. If those memes instead propagate through a separate medium, i.e. AI, then human reproduction will be unnecessary

  • @d3psi488
    @d3psi488 2 years ago +2

    this is not natural coding behaviour... you can see the cursor bugging behind where text is being added. dude took the finished project and let it run through some other program to print it character by character for the video xD

    • @markkraay
      @markkraay  2 years ago +2

      Either that, or I'm just a perfect typer ; )

    • @d3psi488
      @d3psi488 2 years ago +1

      @@markkraay if you are, kudos to you sir, but i'm not placing my money on that ;)

  • @DJAntivenom
    @DJAntivenom 2 years ago +1

    I didn't know Ed Sheeran makes YouTube videos about NNs.

  • @matthewpublikum3114
    @matthewpublikum3114 2 years ago +1

    Hey no bias terms?

  • @愛
    @愛 2 years ago +1

    nice but get a pop filter

  • @klynx2599
    @klynx2599 2 years ago +1

    Hey, awesome video. But please adjust your mic settings. A pop filter would really help

  • @jpsalis
    @jpsalis 2 years ago +2

    neural network from scratch in scratch

  • @MsDuketown
    @MsDuketown 5 months ago

    a makefile, when you should use default.nix & neural.cabal?
    lol. What kind of video is this? Brocante technology aimed at adolescent whizz-kids?
    Note:
    I highlight 2 out of 3 weaknesses. Do you know the third? It has nothing to do with autism...

  • @nngnnadas
    @nngnnadas 2 years ago +2

    Well, yeah, it's a technology that produces black boxes.

  • @homelessrobot
    @homelessrobot 9 months ago

    One suggestion I have about the matrix data structure is to NOT make it a pointer-to-pointer. Just make it a pointer. You can treat a 1-dimensional sequence as a 2-dimensional sequence with pointer arithmetic, like
        // this would go in the header file with the struct, not in the implementation,
        // because this makes it inlineable.
        static inline double *matrix_row(matrix *mat, size_t row)
        {
            return mat->entries + mat->cols * row;
        }
    and access individual elements like
        size_t row = 3;
        size_t col = 5;
        double elem = matrix_row(mat, row)[col];
    No double pointers necessary. This also simplifies many of the operations that don't really care about the specific dimensions of the input, such as copy, flatten, fill, randomize, etc. It also simplifies the memory management and makes unary operations over matrices substantially more cache-friendly.
    If you want to optimize even further to minimize pointer dereferencing, you can put the size of the matrix into the same memory region as the elements, and just pass around pointers to this. You would do this by turning the 'entries' pointer field into a flexible array member, like this:
        typedef struct matrix
        {
            size_t rows;
            size_t cols;
            double entries[];
        } matrix_t;
    The 'entries' field has to be the last field in such a structure. What it means is "I will allocate these immediately after the struct in memory". So then you do that, like
        (matrix_t *)malloc(sizeof(matrix_t) + rows * cols * sizeof(double))
    'sizeof(matrix_t)' here only counts the 'rows' and 'cols' fields, because that is the minimum amount of space such a structure will take up. After that, you can essentially treat the struct as if 'matp->entries' were a pointer field. It's not, though; it's just syntax sugar for pointer arithmetic over 'matp'.
    This also makes it easier to handle matrices safely. If you want a matrix pointer to be constant, you just say 'const matrix_t *'; now both the entries and the dimensions are treated as const because there isn't a second/third level of indirection which would evade constness.
    Just in general, and especially in C where you pay both a performance and a usability cost, flat structures are better when they are possible.

  • @inqonthat1463
    @inqonthat1463 1 year ago

    Although I applaud your Star Trek utopian viewpoint, my life experiences have shown me a different future. Whereas in my youth (I saw TOS when it was... original) many of my friends had that same viewpoint, I don't see that in the average kid these days. Now, narcissism and video games seem to have the lion's share. Don't get me wrong... anyone older is even worse. The idea that someone put out of a job (white or blue collar) is going to "better" themselves in the fine arts or theoretical physics... just doesn't sound plausible. One has to eat and have shelter. Especially when their family is going hungry and a billion other people are unemployed. Even at my retired age, I can't say I won't have to deal with it. White-collar jobs (and even the creative ones) are falling to AI now. Tesla will have a usable blue-collar replacement robot this decade. Then what happens when millions of factory workers start going hungry? I don't see an answer. The genie can't be put back in the bottle.

  • @henrikvtcodes
    @henrikvtcodes 2 years ago +4

    Amazing video. Really clarified the basics of it all; especially those diagrams with the dots and lines. It’s just a bunch of math comparing values! Makes a lot of sense now.

  • @Superfastisfast
    @Superfastisfast 2 years ago

    thought he literally was making it on Scratch /: ya know, block-coding

  • @Usualyman
    @Usualyman 2 years ago

    So... you just malloc a huge bunch of matrices without freeing their memory after usage?
    Hope your computer has terabytes of RAM! :D
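
    A matching cleanup routine for a double-pointer matrix layout might look like this (the struct and names are assumptions, mirroring the video's design rather than quoting it):

```c
#include <stdlib.h>

/* Hypothetical matrix struct matching the video's double-pointer layout. */
typedef struct {
    int rows, cols;
    double **entries;
} Matrix;

/* Free in reverse order of allocation: each row, then the
   row-pointer array, then the struct itself. */
void matrix_free(Matrix *m) {
    if (m == NULL) return;
    for (int i = 0; i < m->rows; i++)
        free(m->entries[i]);
    free(m->entries);
    free(m);
}
```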

  • @SeaUsername
    @SeaUsername 11 months ago

    do school classrooms have to have background music these days?? it is such an unnecessary, annoying distraction. For god's sake stop it, we can watch a video without background music!

  • @krtirtho
    @krtirtho 2 years ago

    Plot twist: This guy actually doesn't exist & is completely made up by AI & this channel is operated by AI. The AI has created this video to assure us AI won't take over the world so that we remain calm while the AI actually takes over the world
    Don't trust the AI
    _generated by the good AI_

  • @The101Superman
    @The101Superman 2 years ago

    >builds neural network from scratch
    >renders potato webcam at 1080p60

  • @mustafaaljshamee6593
    @mustafaaljshamee6593 2 years ago

    Hi, first of all I would like to express my thanks; however, I've found an error in the multiply function, just in case.
    Regards

  • @homelessrobot
    @homelessrobot 9 months ago

    you say 'everyone's favorite programming language: C' with an air of irony, but it's closer to the truth than any of the alternatives.

  • @rubyciide5542
    @rubyciide5542 1 year ago

    Ai take over your job???
    Bro, the job applications are so saturated with humans that I'm not getting a job, forget about AI

  • @Jalae
    @Jalae 2 years ago

    just because the job is trivial and will be solved without you doesn't mean some new magic task is going to pop up. people lose jobs, regions have boom-bust cycles. empires fall.

  • @Scherbiusthecringe
    @Scherbiusthecringe 11 months ago

    What algorithm did you use for matrix inversion?

  • @teal8365
    @teal8365 2 years ago +2

    once he discovers python it's all over. the next video will be "creating a singularity from scratch in Python"

    • @kitsune7229
      @kitsune7229 2 years ago +2

      For that C would be better xD
      It's faster, and black holes actually have memory leaks, you know, Hawking radiation xD

  • @platin2148
    @platin2148 2 years ago

    I only heard Teslas going rough. Machine learning is pretty lame.

  • @microelectronics5732
    @microelectronics5732 1 year ago

    Damn, this really caught me. I will definitely try this over the holidays.

  • @maxtory8063
    @maxtory8063 2 years ago

    Alex Trimboli teaches you neural networks

  • @platin2148
    @platin2148 2 years ago +2

    I suggest making a pool of memory from which you allocate; calling malloc so often slows it down.
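
    A minimal sketch of that idea: one up-front allocation, cheap bump allocations from it, and a single free at the end (all names here are illustrative, not from the video):

```c
#include <stdlib.h>

/* A bump ("arena") allocator: one malloc up front, then each
   allocation just advances an offset. Everything is released at
   once with a single free. */
typedef struct {
    unsigned char *base;
    size_t used, cap;
} Pool;

Pool pool_create(size_t cap) {
    Pool p = { malloc(cap), 0, cap };
    return p;
}

void *pool_alloc(Pool *p, size_t n) {
    n = (n + 15) & ~(size_t)15;          /* keep 16-byte alignment */
    if (p->base == NULL || p->used + n > p->cap)
        return NULL;                      /* out of pool space */
    void *out = p->base + p->used;
    p->used += n;
    return out;
}

void pool_destroy(Pool *p) {
    free(p->base);
    p->base = NULL;
    p->used = p->cap = 0;
}
```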

  • @cippo1995
    @cippo1995 2 years ago

    Hi, I don't see the license in your github repository: can I fork and modify this code?

  • @Gahlfe123
    @Gahlfe123 2 years ago +12

    this was a great video. wish I had found this way back when I was taking a masters course in FPGA dealing with a neural network in C++

  • @sunpoke5317
    @sunpoke5317 2 years ago

    So... Am I supposed to understand anything?

  • @zulc22
    @zulc22 2 years ago

    neural network in scratch from c

  • @brunouribe8987
    @brunouribe8987 2 years ago

    C neural network =big brain shit
    Epic man

  • @connorkoury5434
    @connorkoury5434 2 years ago +2

    You think you'll ever post a video on convolutional neural networks?

  • @xloppyschannel4881
    @xloppyschannel4881 2 years ago +1

    AI is very limited; it can't really be trusted with anything critical.

  • @ThylineTheGay
    @ThylineTheGay 2 years ago

    you seem to have forgotten to add a license to the github

  • @ahmadmuslih
    @ahmadmuslih 2 years ago

    bruh, because of the music I fell asleep mid-video

  • @irwainnornossa4605
    @irwainnornossa4605 2 years ago

    Jesus, put { on a new line, this is unreadable.

  • @JCtheMusicMan_
    @JCtheMusicMan_ 2 years ago +1

    Excellent presentation! I see so many uses for this in my head. My desire is to use ML for quickly analyzing large data sets. I need to train the neural network behind my eyes before I can implement such a network 😅

  • @i_love_python5862
    @i_love_python5862 2 years ago +1

    this channel is very underrated. gotta leave a comment before this takes off :)

  • @sirynka
    @sirynka 2 years ago +3

    How much slower is this implementation compared to tensorflow/pytorch (CPU mode)?

    • @markkraay
      @markkraay  2 years ago

      It really depends on the architecture of the model and the amount of training you wish to perform. For example, if you were to use a more efficient optimization algorithm, you would see quicker convergence. Any real speed up due to language efficiency (C is generally considered very fast) doesn't make a difference because those libraries are essentially written in C/C++, but provide bindings to other languages such as Python or Swift.

    • @sirynka
      @sirynka 2 years ago +1

      @@markkraay Yeah, and your naive implementation of a linear algebra library doesn't use AVX or multicore processing to further parallelise and speed up the computation.

    • @markkraay
      @markkraay  2 years ago

      Yes! I haven't looked too deeply into multicore processing with C and I'm completely unfamiliar with AVX, but those would be nice improvements.

    • @timtreichel3161
      @timtreichel3161 2 years ago +6

      @@sirynka The way the matrices are allocated is also slow. You want to allocate everything in one array for fewer cache misses. But this intuitive approach is great if you want to understand the basics and math behind NNs. Optimization can be done once everything works, and when you are serious about using NNs you'd want to use the GPU anyway.

    • @alefratat4018
      @alefratat4018 2 years ago

      From my experience, you can get around 10x speed-up between a naive C implementation (such as in this video) and a fully optimized one (such as used in NN inference engines). And it usually scales well, so the bigger the network, the larger the speed-up.

  • @timgoodliffe
    @timgoodliffe 2 years ago

    your lighting terrifies me lol

  • @gasfeesofficial3557
    @gasfeesofficial3557 2 months ago

    next zuckerberg

  • @needlessoptions
    @needlessoptions 2 years ago

    Please for the love of god get a pop filter

  • @CT-cx8yi
    @CT-cx8yi 2 years ago +1

    Great video. But please, get a pop filter!

  • @alexdeng2184
    @alexdeng2184 2 years ago +1

    Amazing. Super simple breakdown and you strengthened my knowledge. Thank you!

  • @GlobalYoung7
    @GlobalYoung7 2 years ago

    thank you 🙏

  • @yuriy2090
    @yuriy2090 2 years ago +2

    This is a great video. I truly enjoyed it!

  • @csoham96
    @csoham96 2 years ago

    write in golang pls

  • @BryceChudomelka
    @BryceChudomelka 2 years ago

    was there data leakage?

  • @raphaelcardoso7927
    @raphaelcardoso7927 2 years ago

    Thanks for using C

  • @jomo_sh
    @jomo_sh 2 years ago

    what color scheme

  • @andydataguy
    @andydataguy 2 years ago +1

    Awesome video! Glad you did it in C 🙏🏾

  • @quantumastrologer5599
    @quantumastrologer5599 2 years ago

    ..... Dr. Manhattan?

  • @andreujuanc
    @andreujuanc 2 years ago

    pop filter pls xD

  • @damn_right_man8606
    @damn_right_man8606 2 years ago

    You are a genius

  • @phicoding7533
    @phicoding7533 2 years ago

    What ide is this?

  • @dylanalexander5163
    @dylanalexander5163 2 years ago +1

    absolutely wonderful video, top quality stuff

  • @calkenzo
    @calkenzo 1 year ago

    Malch

  • @jasonford7439
    @jasonford7439 2 years ago +1

    I've watched many ML videos and one that's missing is about inputs/outputs; if I want my input to be x format (image, audio file, whatever) and my output to be y (video, text, whatever), how do I make that happen? Is it as simple as just converting them to a sensible matrix? Are there caveats? Your video implies that certain matrix dimensions are desirable, why? What if each training input is multiple files? Or if I want the model to solve several distinct (but related) things together?

    • @Aditya-ne4lk
      @Aditya-ne4lk 2 years ago

      it is as simple as converting them to a sensible matrix.
      say you have 10 classes, and a dataset of images of size 256 x 256. you want the output of your neural network to be a prob distribution over the 10 classes for an input, so the last layer of your neural network should essentially be (1,10), which will be a probability distribution. So you have to find a way of reducing (256 x 256) to (1 x 10) using your hidden layers. Depending on what "layers" you use, you will have to compute the size of the intermediate matrix. For example, there is a formula for convolution layers to calculate what the output size of the matrix should be, given a convolution stride and padding.
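
      The formula referred to for convolution layers is the standard one (a general result, not something from this video); as a sketch:

```c
/* Standard output-size formula for a convolution layer, per spatial
   dimension:
       out = (in - kernel + 2*padding) / stride + 1
   (integer division, i.e. the floor for non-negative operands). */
int conv_out_size(int in, int kernel, int stride, int padding) {
    return (in - kernel + 2 * padding) / stride + 1;
}
```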

    • @jasonford7439
      @jasonford7439 2 years ago

      @@Aditya-ne4lk ok, I think I understand. What if my input per training example is multiple files? For example, a labelled picture of a cow and an audio file where it moos? How do I combine? What if I want the output to also be multiple things per item, like sex/weight estimation based on the combined audio/image?

  • @hozas8553
    @hozas8553 2 years ago

    WTF

  • @floxer
    @floxer 2 years ago

    Oh btw, because the topic popped up in the intro. If you're interested in Tesla's NN Architecture, you can watch their "AI Day" from August last year.
    ruclips.net/video/j0z4FweCy4M/видео.html
    (There are timestamps in the description. I would say the "Tesla Vision" part is for you then)

  • @marcotroster8247
    @marcotroster8247 2 years ago +7

    Great work! How fast does the training routine run compared to e.g. TensorFlow? 😉
    I'm also a C and AI enthusiast. And as you've just demonstrated, it's 100% feasible to make this work without Python or any other black magic dependencies.
    It's still fascinating to me how some great, crisp math theory shines. I bet it's only as good because it was created in an environment of very low computational capability. Those engineers back then had to think of powerful but easy-to-compute math. And this is the result.
    Of course, MNIST classification is a really easy task, but it shows that this crazy resource waste in AI with GPU clusters is not always necessary. And plain Python is ~70 times slower than C. This means 98% (!!!) of computation goes to waste. Save the climate with better programming! 🌍

    • @marcotroster8247
      @marcotroster8247 2 years ago +3

      @@bowenfeng9750 Haha you're funny. TensorFlow is only fast because it completely bypasses the Python C-API by telling it upfront with functional programming what has to be done. Then TensorFlow builds a computation graph that doesn't need to call back into Python all the time. Otherwise it would be fckin slow, even though the C routines are fast 😂
      The Python C-API is actually a crazy bottleneck. If you're sending lots of small batches to e.g. NumPy, it's quite inefficient. Try it yourself. I've created my own C-Python chess extension and it's horribly slow even though the C code is blazing fast 😉
      PS: Please consider that the people you're talking to know what they're saying, little gatekeeper 😂😂😂

    • @stxnw
      @stxnw 2 years ago +3

      @@marcotroster8247 he deleted his comment, im dyinggg 😂😂😂
      anyways, i truly do not see the point of building an AI framework from scratch, whether in C or not, when hundreds of scientists way smarter than you and I have already written code 100x more optimized than any of us, packaged into a library.
      another thing, IIRC, vectorization and broadcasting in numpy can be as fast or faster than pure C, so im not sure why you think NumPy is inefficient, so long as you avoid the overheads.

    • @marcotroster8247
      @marcotroster8247 2 years ago +1

      @@stxnw Sure, if you're doing it right, Python doesn't add a crazy bottleneck while training. But not all devs in AI understand how a PC works. Lots of them are just mathematicians who think it's magic 😂
      And also don't forget about the IO during the other stages of data retrieval / data preprocessing / model deployment, etc. There's still lots of inefficiency because training is only the tip of the iceberg 😅
      What I'm criticizing is that people in AI have this extremely wasteful mentality of optimizing coding time vs. runtime. They would rather buy 10x bigger workstations than putting some thought into what they wanna achieve.
      And yes, doing AI from scratch for learning isn't the worst thing. Of course there are reasons why people use Python for more complicated models, don't get me wrong. But telling people "it's 100% unfeasible" is wrong, too 😂

    • @Daniel-ih4zh
      @Daniel-ih4zh 2 years ago

      This was not the state of the art until better computers came about. It's not an example of engineered, powerful but easy-to-compute math.

    • @marcotroster8247
      @marcotroster8247 2 years ago

      @@Daniel-ih4zh Ok, I wanna believe you, but can you provide an example 😉

  • @demon_hunter9547
    @demon_hunter9547 2 years ago

    if the output layer has the softmax activation function, then shouldn't the backpropagation also use softmax prime or something for calculating the gradient?
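
    For what it's worth, a common answer (a standard result, not a claim about this video's code, which may use a different loss): when softmax feeds into a cross-entropy loss, the softmax Jacobian cancels against the loss gradient, so no explicit softmax-prime is needed:

```latex
\text{Softmax: } \hat{y}_i = \frac{e^{z_i}}{\sum_j e^{z_j}}, \qquad
\text{cross-entropy: } L = -\sum_i y_i \log \hat{y}_i
\;\;\Longrightarrow\;\;
\frac{\partial L}{\partial z_i} = \hat{y}_i - y_i
```

    With a different loss (e.g. mean squared error), the full softmax Jacobian does appear in the chain rule.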

  • @inqonthat1463
    @inqonthat1463 1 year ago

    Outstanding video! I get so tired of, and exit, these videos that feel compelled to give us the history of the world from the abacus on, or have to jump around and wave their hands. This was a breath of fresh air! Thank you. For those wanting more understanding of theory or background... I say do your own simple search of YouTube or Google. You'll get a million hits. This video stands alone for concise, usable work!

  • @saeye
    @saeye 2 years ago

    It's 3am where I live and your video autoplayed for me accidentally. I'm hooked. Will try this myself tomorrow. The end was sweet too, man.

  • @chriscruz429
    @chriscruz429 2 years ago

    Implementing AI for every day businesses is a huge opportunity. The new job will be an AI consultant.

  • @nrdfoss
    @nrdfoss 2 years ago

    lol the way your face on the recording looks scared me (no offense btw i'm talking about the cam quality)

  • @AMFLearning
    @AMFLearning 2 years ago +1

    #amflearningbydoing @AMFLearning

  • @Max-bf9lm
    @Max-bf9lm 2 years ago

    Now, parallelise this with OpenMP! (I'm genuinely interested in seeing the performance..)

  • @ayato7429
    @ayato7429 2 years ago

    Python...

  • @janedoe6182
    @janedoe6182 2 years ago

    What about "Allocation free neural network from scratch"?

  • @alexanderskladovski
    @alexanderskladovski 2 years ago

    Imagine the performance issues from overhead experienced by people who use high-level abstractions to write concise and readable code.

  • @vtrandal
    @vtrandal 1 year ago

    Fantastic. Excellent. Thank you!

  • @prasaddd
    @prasaddd 2 years ago

    Surprisingly good Wi-Fi under the rock here 👍

  • @ensiopoidokoff7367
    @ensiopoidokoff7367 2 years ago

    Ahh yes, one can toss aside try/catch and still have non-local exits.

  • @sidhantsood5373
    @sidhantsood5373 2 years ago

    Off topic but what was the music used in the video?

  • @hongkyang7107
    @hongkyang7107 2 years ago

    I am trying to read the darknet src to use its API; this video helps.

  • @Panure
    @Panure 2 years ago

    Great video, but you should invest in a pop filter

  • @rbnstmar
    @rbnstmar 2 years ago

    I found this mistake:
    mnist-from-scratch-master]$ make
    /usr/bin/gcc -std=c99 -c main.c -o main.o -lm
    main.c: In function ‘main’:
    main.c:17:3: error: too few arguments to function ‘network_create’
    NeuralNetwork* net = network_create(784, 300, 10);
    ^
    In file included from main.c:7:0:
    neural/nn.h:15:16: note: declared here
    NeuralNetwork* network_create(int input, int hidden, int output, double lr);
    ^
    make: *** [main.o] Error 1

  • @simonshaffer5813
    @simonshaffer5813 2 years ago +1

    Really cool, interesting to see it in C

  • @torphedo6286
    @torphedo6286 1 year ago

    Great video, this helped enormously. I was struggling a lot with understanding matrices and making a sane matrix struct.

  • @RemiNesheim
    @RemiNesheim 2 years ago

    Hey man - great video! Definitely gonna keep watching your videos. I really enjoyed the discussion points you highlighted at the end. Keep doing what you do, because you're really good at it :)

  • @NootNooter
    @NootNooter 2 years ago

    slightly worried about the question mark box at the end of the output