neural network from scratch in C

  • Published: 4 Nov 2024

Comments • 247

  • @arjuniyer777
    @arjuniyer777 2 years ago +186

    I was always under the impression that creating a neural network, ESPECIALLY IN C would be something too advanced to even understand. However, I understood everything you mentioned and you put it together in such a concise and straightforward manner. Great video!

  • @cybernerddante
    @cybernerddante 2 years ago +568

    Especially impressive in C versus the "Python for everything" approach

    • @duxoakende
      @duxoakende 2 years ago +65

      Eh usually when you see python projects for everything they're using a 3rd party lib to handle it for them. Which makes sense, python is a high level scripting language and not designed to be a heavy lifter, rather allowing libs written in lower level languages handle the load for intense computations while abstracting it in an incredibly useful manner. You could do this same thing in python without using a 3rd party library, it's just gonna be way slower in pure python

    • @duxoakende
      @duxoakende 2 years ago +9

      @@ccriztoff maybe. Honestly me personally I'd just like to learn rust, but go is on the list

    • @duxoakende
      @duxoakende 2 years ago +7

      @Walter Hartwell White Ngl that's one of my favorite activities, writing cpython modules lol. Although cython is much easier tbh :/

    • @misaalanshori
      @misaalanshori 2 years ago +13

      in most machine learning projects I know, Python is usually just tape to connect libraries, which were probably written in C (or other high-performance compiled languages). So as a whole, the Python code is probably very small.

    • @shaekahmed8842
      @shaekahmed8842 2 years ago

      @@duxoakende can I contact you? I wanna discuss "cython"

  • @phoneaccount6907
    @phoneaccount6907 2 years ago +38

    Very dense story, you condensed a 2-hour lecture into 9 minutes. Impressive!

  • @demelengopnik7187
    @demelengopnik7187 2 years ago +15

    read it as "neural network in scratch", was very intrigued

  • @jacxta
    @jacxta 1 year ago +12

    C tip: if you are trying to recreate the project on Windows, use calloc() instead of malloc() to create the structs. This is especially important for the matrices, so they contain zeroes initially instead of leftover values from memory.

    • @inqonthat1463
      @inqonthat1463 1 year ago +4

      In my first run through of the code after getting it to compile under VC++ I only got a 0.003 success rate. This seems to fix that! Got 0.901. Thanks.

  • @m4rt_
    @m4rt_ 2 years ago +26

    Media: "AI is going to take over the world"
    Me: *shows image of a cat to AI*
    AI: "That's a frog"

    • @LightOffArchives
      @LightOffArchives 2 years ago +2

      More like
      This is 99.5% frog and 0.5% police car

    • @m4rt_
      @m4rt_ 2 years ago +1

      @@LightOffArchives AI: "This shadow is a 600 year old human"

    • @m4rt_
      @m4rt_ 2 years ago

      @HtAne
      Me: *shows image of cat*
      AI: "that's a cat"
      Me: *shows the same image, but with one changed pixel*
      AI: "that is not a cat"

  • @stefanogagliardi4665
    @stefanogagliardi4665 2 years ago +108

    You left me speechless, finally advanced programming content.
    Thanks for sharing and for your time! I hope for more high-level videos like this, rather than "how to center a div".
    Explaining ML this way is rare. I wish there were a course taught this way, not the usual one explaining how to use "Keras" or "PyTorch"; there are none that explain the concepts and implementation the way you do! :)

    • @markkraay
      @markkraay  2 years ago +14

      Thank you so much! I'm glad you enjoyed :) There is definitely more to come!

    • @alexandrubragari1537
      @alexandrubragari1537 2 years ago +8

      You think "how to center a div" is not advanced?

    • @nyzss
      @nyzss 2 years ago +4

      nah dude you wrong for the "how to center a div" comment

    • @0xfeedcafe
      @0xfeedcafe 2 years ago +1

      @@nyzss true, because there are a lot of videos out there about stuff like this video

    • @yochem9294
      @yochem9294 2 years ago +2

      explaining ML this way is not rare. It’s literally how it’s taught at university. I would always advise learning it ‘from scratch’ if people tell me they want to learn about ANNs :)

  • @justinmitchell3822
    @justinmitchell3822 2 years ago +84

    Linear algebra nitpick: dot products are an operation on two vectors ( en.wikipedia.org/wiki/Dot_product )
    What's described in the video around 2:15 is matrix multiplication ( en.wikipedia.org/wiki/Matrix_multiplication )

    • @Davi-c4q
      @Davi-c4q 2 years ago +11

      to be honest I think depending on the context dot is used for matrix multiplication. It's the kind of convention that changes through time. Even numpy does that

    • @gtgunar
      @gtgunar 2 years ago

      They are essentially the same operation (and also matrix-vector / vector-matrix). In APL, the inner product (a higher-order function) of sum and multiply gives you the needed function. Essentially, it's the vector-vector dot product, but if you use rank polymorphism, it works out the same.
      Rank polymorphism is automated size matching: when you have a scalar, a vector, and a valid function applied, for example: 1+2 3 4 = 3 4 5.
      For matrices it's a bit more complicated but works out just as well.
      Essentially, a matrix in APL is a column vector of row vectors, and you pair up a column vector of row vectors with a bunch of row vectors. The shape and order are kept.

    • @jackgao5681
      @jackgao5681 2 years ago

      they're the same thing!

    • @philperry6564
      @philperry6564 2 years ago +2

      @@jackgao5681 The result of a dot product is a scalar, while the result of matrix multiplication is a matrix.

    • @DreamzAnimation
      @DreamzAnimation 2 years ago +1

      @@philperry6564 A scalar can be thought of as a 1 by 1 matrix. If you have two n-dimensional vectors, you can define the dot product as the single element returned from the vector multiplied by the other's transpose. In this sense, the inner product is an example of matrix multiplication, resulting in a 1 by 1 matrix result.

  • @illanes00
    @illanes00 2 years ago +14

    I'm asking myself why this video doesn't have a million views yet. Amazing job!!

  • @recarsion
    @recarsion 2 years ago +64

    This gives me so much nostalgia for when I tried to do literally the same thing but in C++. In the end I gave up because something was not quite right mathematically, it always ended up stuck after a few iterations of learning and could make no further progress. I thought I understood the math but I could never figure out where I went wrong. This gives me a lot of inspiration to come back to that project, or possibly re-write it as in the meantime I realized I don't like C++.

    • @tylerruiz3476
      @tylerruiz3476 2 years ago +4

      Sounds like you might have had a problem with vanishing gradients or lack of precision. Did you normalize the inputs?

    • @nsfeliz7825
      @nsfeliz7825 2 years ago

      i hate c++

    • @recarsion
      @recarsion 2 years ago

      @@tylerruiz3476 Yes my inputs were normalized. It might be worth a shot to try without. I thought vanishing gradient was only supposed to be a problem in deep networks and I only had 1 hidden layer, but it may indeed be the problem.

    • @recarsion
      @recarsion 2 years ago +7

      So a bit of an update, I've just sat down and fixed the whole thing in 3 hours. It wasn't vanishing gradient or floating point limitations or any of that. My math was a bit off plus the code itself is kinda terrible so I didn't spot some basic mistakes. Turns out I've gotten much better at programming in the few years since I've abandoned this project.
      My network now gets 90.6% efficiency with 784-100-10 layers, sigmoid activation, mean square error. The only difference to the network in the vid is that mine also has bias vectors.

    • @ithaca2076
      @ithaca2076 1 year ago

      @@recarsion hell yea, that's awesome

  • @Avighna
    @Avighna 2 years ago +19

    Finally, a machine learning tutorial that doesn't use a toy language.

  • @inqonthat1463
    @inqonthat1463 1 year ago

    Outstanding video! I get so tired of, and exit, these videos that feel compelled to give us the history of the world from the abacus on, or that have to jump around and wave their hands. This was a breath of fresh air! Thank you. For those wanting more understanding of theory or background... I say do your own simple search of YouTube or Google. You'll get a million hits. This video stands alone for concise, usable work!

  • @i_love_python5862
    @i_love_python5862 2 years ago +1

    this channel is very underrated. gotta leave a comment before this takes off :)

  • @saeye
    @saeye 2 years ago

    It's 3am where I live and your video autoplayed for me accidentally. I'm hooked. Will try this myself tomorrow. The end was sweet too, man.

  • @12crenshaw
    @12crenshaw 2 years ago +4

    This dude just casually wrote a machine learning neural network from scratch in C

  • @henrikvtcodes
    @henrikvtcodes 2 years ago +4

    Amazing video. Really clarified the basics of it all; especially those diagrams with the dots and lines. It’s just a bunch of math comparing values! Makes a lot of sense now.

  • @DenisovichDev
    @DenisovichDev 2 years ago +86

    Hey can you please increase the font size a bit in your future videos, it's a bit small. Love the video!

    • @markkraay
      @markkraay  2 years ago +30

      Definitely! Thanks for the suggestion!

    • @mhmmdshaz98
      @mhmmdshaz98 2 years ago +3

      Well, while we're on that matter, can you please mention the name of that font 😛

    • @aarona3144
      @aarona3144 2 years ago +3

      @@markkraay I just found your video/channel through a recommendation, so I don't know how many videos you've made since this one, but get yourself a good mic setup (a screen in front of the mic, for example). It should fix all the pops while you're speaking.
      You've got potential to create a really good channel here, so look into it. Should be a good investment.

  • @Gahlfe123
    @Gahlfe123 2 years ago +12

    this was a great video. wish i found this way back when i was taking a masters course in FPGA dealing with a neural network in C++

  • @homelessrobot
    @homelessrobot 10 months ago

    One suggestion I have about the matrix data structure is to NOT make it a pointer-to-pointer. Just make it a pointer. You can treat a one-dimensional sequence as a two-dimensional sequence with pointer arithmetic, like:

        // this would go in the header file with the struct, not in the
        // implementation, because this makes it inlineable.
        static inline double *matrix_row(matrix *mat, size_t row)
        {
            return mat->entries + mat->cols * row;
        }

    and access individual elements like:

        size_t row = 3;
        size_t col = 5;
        double elem = matrix_row(mat, row)[col];

    No double pointers necessary. This also simplifies many of the operations that don't really care about the specific dimensions of the input, such as copy, flatten, fill, randomize, etc. It also simplifies the memory management and makes unary operations over matrices substantially more cache-friendly.
    If you want to optimize even further to minimize pointer dereferencing, you can put the size of the matrix into the same memory region as the elements, and just pass around pointers to this. You would do this by turning the 'entries' pointer field into a flexible array member, like this:

        typedef struct matrix
        {
            size_t rows;
            size_t cols;
            double entries[];
        } matrix_t;

    The 'entries' field has to be the last field in such a structure. What it means is "I will allocate these immediately after the struct in memory". So then you do that, like:

        (matrix_t *)malloc(sizeof(matrix_t) + rows * cols * sizeof(double))

    'sizeof(matrix_t)' here only counts the 'rows' and 'cols' fields, because that is the minimum amount of space such a structure will take up. After that, you can essentially treat the struct as if 'matp->entries' were a pointer field. It's not, though; it's just syntax sugar for pointer arithmetic over 'matp'.
    This also makes it easier to handle matrices safely. If you want a matrix pointer to be constant, you just say 'const matrix_t *'; now both the entries and the dimensions are treated as const, because there isn't a second or third level of indirection that would evade constness.
    Just in general, and especially in C, where you pay both a performance and a usability cost, flat structures are better when they are possible.

  • @AliMoeeny
    @AliMoeeny 2 years ago

    what a flex dude. in C . damn. great work.

  • @prasaddd
    @prasaddd 2 years ago

    Surprisingly good Wi-Fi under the rock here 👍

  • @andydataguy
    @andydataguy 2 years ago +1

    Awesome video! Glad you did it in C 🙏🏾

  • @microelectronics5732
    @microelectronics5732 1 year ago

    Damn, this really caught me. I will definitely try this between the years.

  • @RemiNesheim
    @RemiNesheim 2 years ago

    Hey man - great video! Definitely gonna keep watching your videos. I really enjoyed the discussion points you highlighted at the end. Keep doing what you do, because you're really good at it :)

  • @torphedo6286
    @torphedo6286 2 years ago

    Great video, this helped enormously. I was struggling a lot with understanding matrices and making a sane matrix struct.

  • @imbesrs
    @imbesrs 2 years ago +15

    Hey man, could you do a video on neural nets for dummies like myself? You explain things pretty well, but I got kind of lost after ~4 mins in the video. Would be a nice prequel to this one. Specifically neural nets in general, and exactly what we were doing with the images themselves after getting the dot product of their matrices. Again, you're great at explaining things! Just subbed.

    • @bonybuinta1630
      @bonybuinta1630 2 years ago

      Take the kaggle course

    • @imbesrs
      @imbesrs 2 years ago +1

      @@bonybuinta1630 What's the specific name so I take the right one?

    • @KALEB32154
      @KALEB32154 2 years ago

      ruclips.net/p/PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi
      Great explanation from 3Blue1Brown. Seems to line up with this video's steps pretty well.

  • @alexdeng2184
    @alexdeng2184 2 years ago +1

    Amazing. Super simple breakdown and you strengthened my knowledge. Thank you!

  • @klynx2599
    @klynx2599 2 years ago +1

    Hey, awesome video. But please adjust your mic settings. A pop filter would really help

  • @RkForeverSlayer
    @RkForeverSlayer 2 years ago

    This is beyond impressive

  • @austinwoodall5423
    @austinwoodall5423 2 years ago +3

    Consider this: before the industrial revolution, there were actually more jobs, but a great many of those jobs were performed by horses. This time we're the horses. The necessity for human reproduction is generated by the propagation of memes throughout our population. If those memes instead propagate through a separate medium, i.e. AI, then human reproduction will be unnecessary.

  • @willvincentparrone3339
    @willvincentparrone3339 2 years ago

    I will be watching your career grow with great interest

  • @platin2148
    @platin2148 2 years ago +2

    I suggest making a pool of memory from which you allocate; calling malloc so often slows it down.

  • @yuriy2090
    @yuriy2090 2 years ago +2

    This is a great video. I truly enjoyed it!

  • @floxer
    @floxer 2 years ago

    Oh btw, because the topic popped up in the intro. If you're interested in Tesla's NN Architecture, you can watch their "AI Day" from August last year.
    ruclips.net/video/j0z4FweCy4M/видео.html
    (There are timestamps in the description. I would say the "Tesla Vision" part is for you then)

  • @JCtheMusicMan_
    @JCtheMusicMan_ 2 years ago +1

    Excellent presentation! I see so many uses for this in my head. My desire is to use ML for quickly analyzing large data sets. I need to train the neural network behind my eyes before I can implement such a network 😅

  • @eetswalads5528
    @eetswalads5528 2 years ago

    Keep up the good work! Amazing videos!

  • @vtrandal
    @vtrandal 1 year ago

    Fantastic. Excellent. Thank you!

  • @trapOrdoom
    @trapOrdoom 2 years ago

    Yeah, this was sick. Thank you!

  • @franciscofarias6385
    @franciscofarias6385 2 years ago

    Brilliant! Thanks for sharing this with us

  • @simonshaffer5813
    @simonshaffer5813 2 years ago +1

    Really cool, interesting to see it in C

  • @homelessrobot
    @homelessrobot 10 months ago

    You say 'everyone's favorite programming language: C' with an air of irony, but it's closer to the truth than any of the alternatives.

  • @brunouribe8987
    @brunouribe8987 2 years ago

    C neural network =big brain shit
    Epic man

  • @dylanalexander5163
    @dylanalexander5163 2 years ago +1

    absolutely wonderful video, top quality stuff

  • @damn_right_man8606
    @damn_right_man8606 2 years ago

    You are a genius

  • @benev0508
    @benev0508 2 years ago

    really cool video! thanks man. looking forward for more content like this.

  • @chadgregory9037
    @chadgregory9037 2 years ago

    THIS IS WHAT THE FUCK WE NEED MORE OF!!!

  • @teal8365
    @teal8365 2 years ago +2

    once he discovers python it's all over. the next video will be "creating a singularity from scratch in Python"

    • @kitsune7229
      @kitsune7229 2 years ago +2

      For that, C would be better xD
      It's faster, and black holes actually have memory leaks, you know, Hawking radiation xD

  • @baldeaguirre
    @baldeaguirre 2 years ago +1

    Why did you use the sigmoid as an activation function instead of ReLU?

  • @raphaelcardoso7927
    @raphaelcardoso7927 2 years ago

    Thanks for using C

  • @samiyousef
    @samiyousef 2 years ago

    Really impressive!

  • @matcarpes
    @matcarpes 2 years ago

    Outstanding video! Subbed

  • @chriscruz429
    @chriscruz429 2 years ago

    Implementing AI for every day businesses is a huge opportunity. The new job will be an AI consultant.

  • @nngnnadas
    @nngnnadas 2 years ago +2

    Well, yeah, it's a technology that produces black boxes.

  • @hongkyang7107
    @hongkyang7107 2 years ago

    I am trying to read the darknet src to use its api, this video helps.

  • @dyspatch8574
    @dyspatch8574 2 years ago +4

    Please do a video with this exact thing, but written in the Rust programming language. Interesting video by the way!

    • @markkraay
      @markkraay  2 years ago +6

      Thanks! I am actually learning Rust right now, so I will definitely consider it!

    • @dyspatch8574
      @dyspatch8574 2 years ago

      @@markkraay Great decision! I started my Rust journey in October and I don't regret it!

    • @dyspatch8574
      @dyspatch8574 2 years ago

      @@orangestapler8729 Why would it be a waste of time? I don't see it as one.

    • @dyspatch8574
      @dyspatch8574 2 years ago

      @@orangestapler8729 I don't see it as a waste of time because, for example, you can improve your knowledge on that language and on that subject by rewriting it again in the language you want.

    • @dyspatch8574
      @dyspatch8574 2 years ago

      @@orangestapler8729 Look at the first comment.

  • @demon_hunter9547
    @demon_hunter9547 2 years ago

    If the output layer has the softmax activation function, then shouldn't the backpropagation also use softmax prime or something for calculating the gradient?

  • @bonybuinta1630
    @bonybuinta1630 2 years ago

    1. What software do you use for your visualizations of the neural net?
    2. What IDE do you use?

    • @markkraay
      @markkraay  2 years ago +1

      1. Manim: github.com/3b1b/manim
      2. Visual Studio Code

  • @Scherbiusthecringe
    @Scherbiusthecringe 1 year ago

    What Algorithm did you use for matrix inversion?

  • @fadirached2386
    @fadirached2386 2 years ago

    Really cool video. Good work.

  • @alexanderskladovski
    @alexanderskladovski 2 years ago

    Imagine the overhead and performance issues experienced by people who use high-level abstractions to write concise and readable code.

  • @shis10
    @shis10 2 years ago

    Excellent video

  • @SeaUsername
    @SeaUsername 1 year ago +1

    Why do we need music in the background, why?????????

  • @inqonthat1463
    @inqonthat1463 1 year ago

    Although I applaud your Star Trek utopian viewpoint, my life experiences have shown me a different future. Whereas in my youth (I saw TOS when it was... original) many of my friends had that same viewpoint, I don't see that in the average kid these days. Now, narcissism and video games seem to have the lion's share. Don't get me wrong... anyone older is even worse. The idea that someone put out of a job (white or blue collar) is going to "better" themselves in the fine arts or theoretical physics... just doesn't sound plausible. One has to eat and have shelter. Especially when their family is going hungry and a billion other persons are unemployed. Even at my retired age, I can't say I won't have to deal with it. White-collar jobs (and even the creative ones) are falling to AI now. Tesla will have a usable blue-collar replacement robot this decade. Then what happens when millions of factory workers start going hungry? I don't see an answer. The genie can't be put back in the bottle.

  • @matthewpublikum3114
    @matthewpublikum3114 2 years ago +1

    Hey no bias terms?

  • @CT-cx8yi
    @CT-cx8yi 2 years ago +1

    Great video. But please, get a pop filter!

  • @sanderbos4243
    @sanderbos4243 2 years ago

    Amazing video!

  • @DelgardAlven
    @DelgardAlven 2 years ago

    THANK YOU!!!

  • @ahmadmuslih
    @ahmadmuslih 2 years ago

    bruh, because of the music I fell asleep mid-video

  • @tanmaypatel4152
    @tanmaypatel4152 2 years ago

    Great video, mate! Could you tell which editor you used in this video?

    • @markkraay
      @markkraay  2 years ago

      Thanks! Visual Studio Code with Vim bindings.

  • @qwerty-wt5dr
    @qwerty-wt5dr 2 years ago

    "in C" (*scream*)

  • @Panure
    @Panure 2 years ago

    Great video but you should invest into a pop filter

  • @connorkoury5434
    @connorkoury5434 2 years ago +2

    You think you'll ever post a video on convolutional neural networks?

  • @TenderBug
    @TenderBug 2 years ago

    It's an insightful video 😊 Thank you!

  • @GlobalYoung7
    @GlobalYoung7 2 years ago

    thank you 🙏

  • @marcotroster8247
    @marcotroster8247 2 years ago +7

    Great work! How fast does the training routine run compared to e.g. TensorFlow? 😉
    I'm also a C and AI enthusiast. And as you've just demonstrated, it's 100% feasible to make this work without Python or any other black magic dependencies.
    It's still fascinating to me how some great, crisp math theory shines. I bet it's only as good because it was created in an environment of very low computational capability. Those engineers back then had to think of powerful but easy-to-compute math. And this is the result.
    Of course, MNIST classification is a really easy task, but it shows that this crazy resource waste in AI with GPU clusters is not always necessary. And plain Python is ~70 times slower than C. This means 98% (!!!) of computation goes to waste. Save the climate with better programming! 🌍

    • @marcotroster8247
      @marcotroster8247 2 years ago +3

      @@bowenfeng9750 Haha you're funny. TensorFlow is only fast because it completely bypasses the Python C-API by telling it upfront with functional programming what has to be done. Then TensorFlow builds a computation graph that doesn't need to call back into Python all the time. Otherwise it would be fckin slow, even though the C routines are fast 😂
      The Python C-API is actually a crazy bottleneck. If you're sending lots of small batches to e.g. NumPy, it's quite inefficient. Try it yourself. I've created my own C-Python chess extension and it's horribly slow even though the C code is blazing fast 😉
      PS: Please consider that the people you're talking to know what they're saying, little gatekeeper 😂😂😂

    • @stxnw
      @stxnw 2 years ago +3

      @@marcotroster8247 he deleted his comment, im dyinggg 😂😂😂
      anyways, i truly do not see the point of building an AI framework from scratch, whether in C or not, when hundreds of scientists way smarter than you and I have already written code 100x more optimized than any of us, packaged into a library.
      another thing, IIRC, vectorization and broadcasting in numpy can be as fast or faster than pure C, so im not sure why you think NumPy is inefficient, so long as you avoid the overheads.

    • @marcotroster8247
      @marcotroster8247 2 years ago +1

      @@stxnw Sure, if you're doing it right, Python doesn't add a crazy bottleneck while training. But not all devs in AI understand how a PC works. Lots of them are just mathematicians who think it's magic 😂
      And also don't forget about the IO during the other stages of data retrieval / data preprocessing / model deployment, etc. There's still lots of inefficiency because training is only the tip of the iceberg 😅
      What I'm criticizing is that people in AI have this extremely wasteful mentality of optimizing coding time vs. runtime. They would rather buy 10x bigger workstations than putting some thought into what they wanna achieve.
      And yes, doing AI from scratch for learning isn't the worst thing. Of course there are reasons why people use Python for more complicated models, don't get me wrong. But telling people "it's 100% unfeasible" is wrong, too 😂

    • @Daniel-ih4zh
      @Daniel-ih4zh 2 years ago

      This was not the state of the art until better computers came about. It's not an example of math engineered to be powerful but easy to compute.

    • @marcotroster8247
      @marcotroster8247 2 years ago

      @@Daniel-ih4zh Ok, I want to believe you, but can you provide an example? 😉

  • @Max-bf9lm
    @Max-bf9lm 2 years ago

    Now, parallelize this with OpenMP! (I'm genuinely interested in seeing the performance...)

  • @DJAntivenom
    @DJAntivenom 2 years ago +1

    I didn't know Ed Sheeran makes YouTube videos about NNs.

  • @mustafaaljshamee6593
    @mustafaaljshamee6593 2 years ago

    Hi, first of all I would like to express my thanks; however, I've found an error in the multiply function, just in case.
    Regards

  • @maestroeragon
    @maestroeragon 2 years ago

    Very cool video, thanks!

  • @maxtory8063
    @maxtory8063 2 years ago

    Alex Trimboli teaches you neural networks

  • @jpsalis
    @jpsalis 2 years ago +2

    neural network from scratch in scratch

  • @evanstar3256
    @evanstar3256 2 years ago

    Awesome video

  • @sirynka
    @sirynka 2 years ago +3

    How much slower is this implementation compared to tensorflow/pytorch (CPU mode)?

    • @markkraay
      @markkraay  2 years ago

      It really depends on the architecture of the model and the amount of training you wish to perform. For example, if you were to use a more efficient optimization algorithm, you would see quicker convergence. Any real speed up due to language efficiency (C is generally considered very fast) doesn't make a difference because those libraries are essentially written in C/C++, but provide bindings to other languages such as Python or Swift.

    • @sirynka
      @sirynka 2 years ago +1

      @@markkraay Yeah, and your naive linear algebra implementation does not use AVX or multicore processing to further parallelize and speed up the computation.

    • @markkraay
      @markkraay  2 years ago

      Yes! I haven't looked too deeply into multicore processing with C and I'm completely unfamiliar with AVX, but those would be nice improvements.

    • @timtreichel3161
      @timtreichel3161 2 years ago +6

      @@sirynka The way the matrices are allocated is also slow. You want to allocate everything in one array for fewer cache misses. But this intuitive approach is great if you want to understand the basics and the math behind NNs. Optimization can be done once everything works, and when you are serious about using NNs you'd want to use the GPU anyway.

    • @alefratat4018
      @alefratat4018 2 years ago

      From my experience, you can get around 10x speed-up between a naive C implementation (such as in this video) and a fully optimized one (such as used in NN inference engines). And it usually scales well, so the bigger the network, the larger the speed-up.

  • @blackpepper2610
    @blackpepper2610 2 years ago

    Great video

  • @jasonford7439
    @jasonford7439 2 years ago +1

    I've watched many ML videos and one thing that's missing is inputs/outputs: if I want my input to be x format (image, audio file, whatever) and my output to be y (video, text, whatever), how do I make that happen? Is it as simple as just converting them to a sensible matrix? Are there caveats? Your video implies that certain matrix dimensions are desirable; why? What if each training input is multiple files? Or if I want the model to solve several distinct (but related) things together?

    • @Aditya-ne4lk
      @Aditya-ne4lk 2 years ago

      It is as simple as converting them to a sensible matrix.
      Say you have 10 classes and a dataset of images of size 256 x 256. You want the output of your neural network to be a probability distribution over the 10 classes for an input, so the last layer of your neural network should essentially be (1, 10). So you have to find a way of reducing (256 x 256) to (1 x 10) using your hidden layers. Depending on what "layers" you use, you will have to compute the size of the intermediate matrix. For example, there is a formula for convolution layers to calculate what the output size of the matrix should be, given a convolution stride and padding.

    • @jasonford7439
      @jasonford7439 2 years ago

      @@Aditya-ne4lk Ok, I think I understand. What if my input per training example is multiple files? For example, a labelled picture of a cow and an audio file where it moos? How do I combine them? What if I want the output to also be multiple things per item, like sex/weight estimation based on the combined audio/image?

  • @NootNooter
    @NootNooter 2 years ago

    slightly worried about the question mark box at the end of the output

  • @sidhantsood5373
    @sidhantsood5373 2 years ago

    Off topic but what was the music used in the video?

  • @alejorabirog1679
    @alejorabirog1679 2 years ago

    Nice!

  • @The101Superman
    @The101Superman 2 years ago

    >builds neural network from scratch
    >renders potato webcam at 1080p60

  • @jbeltz5347
    @jbeltz5347 2 years ago +2

    neural network from scratch in assembly?

  • @cippo1995
    @cippo1995 2 years ago

    Hi, I don't see a license in your GitHub repository: can I fork and modify this code?

  • @mostrealtutu
    @mostrealtutu 2 years ago +4

    Cool video :)
    You have a bit of a popping sound in your audio, consider getting a pop filter for the mic.
    May your channel grow nicely.

  • @d3psi488
    @d3psi488 2 years ago +2

    this is not natural coding behaviour... you can see the cursor lagging behind where text is being added. Dude took the finished project and ran it through some other program to print it character by character for the video xD

    • @markkraay
      @markkraay  2 years ago +2

      Either that, or I'm just a perfect typer ; )

    • @d3psi488
      @d3psi488 2 years ago +1

      @@markkraay if you are, kudos to you sir, but i'm not placing my money on that ;)

  • @avajohnstonn
    @avajohnstonn 2 years ago +1

    💪🏼

  • @CaptainBullzAQW
    @CaptainBullzAQW 2 years ago

    yup, there goes my college degree, lmaoo. Damn, I wish I had a good strong math foundation to support my programming skills; sometimes it's frustrating not being able to transform a math equation into code, e.g. a matrix dot product, or worse, a quadratic equation without using a library lmao

  • @ensiopoidokoff7367
    @ensiopoidokoff7367 2 years ago

    Ahh yes, one can toss aside try/catch and still have non-local exits.

  • @ThylineTheGay
    @ThylineTheGay 2 years ago

    you seem to have forgotten to add a license to the GitHub repo

  • @andythedishwasher1117
    @andythedishwasher1117 2 years ago +5

    I agree with your assessment up to the point that we successfully train AIs to optimize themselves as well as the other machines they interface with. At that point, there's not much for us to do but stand back, watch, and hope we aren't in the way somehow. The designers might have some semblance of advantage in figuring out how not to be in the way, but probably not much of one. The hope would be that the AI finds human activity useful and productive in some capacity, causing its priorities to develop in the direction of cultivating our growth rather than ridding the planet of us, but the results of that assessment will probably come down to the nature of their training data...

    • @andythedishwasher1117
      @andythedishwasher1117 2 years ago

      I think people sometimes forget that an all-knowing computer knows it is made of organic material which must be preserved in an optimized balance for the sake of its own self-preservation. Computers probably care about the planet more than we do at the moment. That's probably the core fact that needs to change very quickly if we want to be regarded by the emerging AI as productive life forms.

    • @ithaca2076
      @ithaca2076 2 years ago

      @@andythedishwasher1117 you're getting ahead of yourself. There's a fine line between lines of code running some type of NN spewing out text that sounds like a human-made sentence, vs. it actually knowing what it said. We have the first one down, look at OpenAI. But I think the current consensus is that no program is self-aware.

    • @ithaca2076
      @ithaca2076 2 years ago

      @@andythedishwasher1117 computers don't care about anything. They just process numbers. Like literally, I'm studying computer engineering and have built a few of my own homebrew CPUs. All they do is compute integer and floating-point math and bitwise operations, with some other aspects not related to their execution stage like jumping to addresses and storing and loading data. The computer does not know what it is doing. It's just working with what it was provided.

    • @andythedishwasher1117
      @andythedishwasher1117 2 years ago

      @@ithaca2076 But so am I. I'm provided with a world full of data and respond to it with the logic that has emerged from the sum total of my brain's processes throughout my lifespan. If a computer also does that to a similar or greater degree of sophistication than I have so far learned to do, I'm inclined to look at it as an equal, or as my superior in the latter case. Superior in the sense that it can do enough things independently that my deciding power over its actions is compromised. Ones and zeroes can reach such degrees of complexity that neurons are no longer capable of processing them, at which point computing becomes as much of a black box as consciousness itself.

    • @ithaca2076
      @ithaca2076 2 years ago +1

      @@andythedishwasher1117 you don't have to try to sound smart, man. Right now there are large supercomputers that can perform billions of operations and more per second, but computers are still deterministic. While we can try to replicate how our brains work with neural networks like spiking nets, LSTMs, Neural Turing Machines, etc., as of now it still boils down to the computer only being able to spit out results without knowing _why_ or _how_ it got there. The only way a computer could know why or how it got somewhere right now is to tell it and have it just recite what we say, which doesn't mean anything at all. No, computers are not some superior sentient beings with ulterior motives, or whatever Forbes may say.

  • @愛
    @愛 2 years ago +1

    nice but get a pop filter

  • @janedoe6182
    @janedoe6182 2 years ago

    What about "Allocation free neural network from scratch"?