Neural networks tutorial: Fully Connected 7 [Java] - Backpropagation implementation

  • Published: 20 Dec 2024

Comments • 104

  • @ryangloff6671
    @ryangloff6671 7 years ago +16

    Great video! Best explanation I have seen on YouTube by far. I've been trying to get my neural net to work for a while, and I'm glad you decided to make this series. Can't wait for the rest of it to come out.
    P.S. Don't worry about the way you say things. It's easily decipherable :)

    • @finneggers6612
      @finneggers6612  7 years ago +1

      Thanks a lot :)
      The next thing will be learning multiple [input + target] sets.
      Then I will make the network learn the MNIST data set.
      I know the feeling. When I started, I had no idea what to do; it took me months to finish my first simple feed-forward network. Now I could probably code one in less than an hour.
      But I am stuck right now as well. I am currently making a library for neural networks, including convolutional neural networks. Everything works except the convolution itself. I know the solution, but for some reason it seems to be wrong...
      If you are interested, I can give you a little hint on how the rest will be done, in case you want to try to code it yourself :)

  • @etopowertwon
    @etopowertwon 5 years ago +2

    Great video. Story time: I had trouble because my network for some reason only worked when I used eta instead of -eta, and I spent a day figuring it out.
    I coded the whole thing in C#, and the Rider IDE suggests naming private variables _likeThis.
    As it turned out, in the evaluation function I mistyped "- weight", which was autocorrected to "- _weight",
    so naturally the results were inverted.

  • @likeyou3317
    @likeyou3317 7 years ago +2

    Damn it's incredible that it's working!

    • @finneggers6612
      @finneggers6612  7 years ago +1

      I agree. Just by changing the values of some weights by simple formulas we can have a small network learn different datasets... it’s truly amazing

  • @chameleonchamlee2551
    @chameleonchamlee2551 1 year ago

    love you, needed this soo much!

  • @sin_maminoy_podrugii
    @sin_maminoy_podrugii 5 years ago +1

    Excellent video! Very useful!

  • @illegalal15
    @illegalal15 5 years ago

    I don't know if you still check these comment sections, but I'm really looking forward to learning neural networks and this helped greatly!
    Sadly, I'm having an error that maybe you could fix: my output gives me numbers extremely close to zero where the target is zero, but it also gives me similar numbers where the target is one. It looks like this, where the target values are {0, 1, 0, 0}:
    [9.126620038568123E-4, 0.02154651878603702, 9.111057499570657E-4, 9.091304009251781E-4]

    • @finneggers6612
      @finneggers6612  5 years ago

      Due to the sigmoid activation function, which is squashed between 0 and 1 (and never reaches either), it is impossible to get exactly 0 or 1 as an output. So 9 * 10^-4 is an expected output for a target of 0.
      However, a target value of 1 should produce something like 0.9999...
      Feel free to send me your code and I will take a look :)
      And yes, I still check all these comments and try to answer every single one of them. Just a bit busy at the moment, but I will continue making videos. I just need some advice on what topic to cover next. Maybe Q-learning, the minimax algorithm (chess), or something like that.
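      For example (a minimal standalone sketch, using the same sigmoid formula as in the series), the function only approaches 0 and 1 for moderate inputs:
      public class SigmoidDemo {
          static double sigmoid(double x) { return 1d / (1 + Math.exp(-x)); }
          public static void main(String[] args) {
              System.out.println(sigmoid(-7)); // ~9.1E-4, the kind of value you see for a target of 0
              System.out.println(sigmoid(7));  // ~0.9991, the kind of value you should see for a target of 1
          }
      }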

  • @Blocksorz
    @Blocksorz 7 years ago +2

    Hi,
    So I've been going through this, and every time I run the network I'm getting values that are way below my target values. As far as I can tell my code is identical; can you think of any reason why this is happening? Cheers

    • @finneggers6612
      @finneggers6612  7 years ago

      Blocksorz well there might be a lot of reasons. Try changing your initial weights.
      If that does not work you can send me your code and I will check it :)

    • @finneggers6612
      @finneggers6612  7 years ago

      So did it work for you? Actually, I am not sure if I talked about this topic: think about a network with only one weight, where the error looks like a high-degree polynomial function. (Of course you have way more weights, but it is easy to visualize it like this.)
      So let's say there is a global minimum at weight = x = 1, but your initial weights are randomly picked between 2 and 3. If you dropped a ball there, it might happen that the ball would not find the global minimum but a local minimum (e.g. at 2.5).
      prntscr.com/gtzwod
      Here you need to ignore the axes and so on (an error function is never negative). But let's assume that your weight is chosen somewhere between these two red lines. By using gradient descent (the backprop algorithm) we are going to reach the local minimum (marked with an arrow) but never reach the global minimum that is to the right.
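      A tiny standalone sketch of this idea (the error curve here is made up purely for illustration, it is not the network's real error function):
      public class LocalMinimumDemo {
          // toy error curve: global minimum near w ≈ 0.97, higher local minimum near w ≈ 2.96
          static double error(double w) { return (w - 1) * (w - 1) * (w - 3) * (w - 3) + 0.3 * w; }
          // numerical derivative, so we don't need the exact formula
          static double slope(double w) { double h = 1e-6; return (error(w + h) - error(w - h)) / (2 * h); }
          public static void main(String[] args) {
              for (double start : new double[]{0.5, 2.5}) {
                  double w = start;
                  for (int i = 0; i < 10000; i++) w += -0.01 * slope(w); // gradient descent: delta = -eta * gradient
                  System.out.println("start " + start + " -> w = " + w + ", error = " + error(w));
              }
          }
      }
      Starting at 0.5 the "ball" rolls into the lower minimum; starting at 2.5 (between the red lines) it settles in the higher, local one.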

    • @Blocksorz
      @Blocksorz 7 years ago

      Hi, thanks for replying and sorry for the delay on my part. I changed a few things around and am now getting more reasonable values, although they are all around 0.6 despite the target being 0,1,0,0. Thank you for the explanation on global and local minima as well, it was very insightful :)

    • @finneggers6612
      @finneggers6612  7 years ago

      Depending on your network architecture that can happen, but it is very unlikely for only one dataset to have such a high error.
      Can you tell me your initial weights, your layer sizes and your inputs, and I will check whether I get the same problem? If not, you probably made a mistake when copying the code :)

    • @finneggers6612
      @finneggers6612  7 years ago

      You must not forget that when you train on a dataset, the backprop algorithm is usually applied something like 10,000 times (implemented in the next video, I think).
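      In code that is just the same call repeated (numbers purely illustrative):
      for (int i = 0; i < 10000; i++) {
          net.train(input, target, 0.3);
      }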

  • @awesome-xl4lg
    @awesome-xl4lg 5 years ago

    My sigmoid function is returning 0 and 1 exactly. Any ideas on what the problem might be?

    • @finneggers6612
      @finneggers6612  4 years ago

      awesome101 no idea without seeing your code. But check if your code is the same as mine.

  • @colinwilliams3459
    @colinwilliams3459 5 years ago

    I know this video was uploaded a while ago, but could you post/provide a GitHub repository for us to use? Thanks!

    • @finneggers6612
      @finneggers6612  5 years ago

      Colin Williams, at that time I wasn't into GitHub. I uploaded the code to a different cloud service. If there is no download link in this video, I think there is one in the next. I am sorry for the inconvenience, but I am on my phone :/

  • @trashmobile3782
    @trashmobile3782 5 years ago +1

    First, I just wanted to say thanks for the great video series. My program works just like yours, and the output is always very close to the target values. However, when I run multiple inputs through the network for training (I train one and then train another with a simple nested for-loop), the output isn't what I would like. For example, I wanted to test the network by training it on inputs such as 0.1 with an expected value of 0.2 (just a bunch of inputs whose targets are double the value), but after training, when I enter a new value such as 0.25, expecting to get 0.5, the output is always basically the same as the output from the last training case. Sorry if this is intended; I'm not exactly sure how neural networks work.

    • @trashmobile3782
      @trashmobile3782 5 years ago +1

      To clarify, basically what I meant was that once I train the network on a data set, the output remains the same regardless of the new inputs.

    • @finneggers6612
      @finneggers6612  5 years ago +1

      @@trashmobile3782 Okay, that's actually a huge topic you opened there, and there could be many reasons.
      1. What's your network size?
      2. What's your input/output data?
      3. What are your initial weights?
      Usually, you want to train on all the data at the same time, something like this (see the sketch after this reply):
      for (int i = 0; i < someNumber; i++) {
          train_first_data_set();
          train_second_data_set();
          train_third_data_set();
          ...
      }
      The reason for that is that a network can "override" what it has previously learned. To avoid this, you want to train on all the data at the same time: if you used backprop on one dataset, do it for all the others as well, and only then start again at the beginning.
      If this doesn't fix your problem, feel free to send me your code via my private email (you can find it on my YT account).
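      A runnable sketch of that idea, using doubling data like in your example (the network size, values and epoch count are just illustrative, and a sigmoid output may still not extrapolate well outside the training range):
      double[][] inputs  = { {0.1}, {0.2}, {0.3}, {0.4} };
      double[][] targets = { {0.2}, {0.4}, {0.6}, {0.8} };
      Network net = new Network(1, 3, 3, 1);
      for (int epoch = 0; epoch < 10000; epoch++) {
          for (int k = 0; k < inputs.length; k++) {      // every data set inside the same loop
              net.train(inputs[k], targets[k], 0.3);
          }
      }
      // a new input between the training points should no longer just echo the last training case
      System.out.println(Arrays.toString(net.calculate(new double[]{0.25})));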

  • @somaysharma3295
    @somaysharma3295 2 years ago

    Finn my guy, please, I need help. I literally copied your code step by step, and everything I want to reduce won't reduce and everything I want to increase won't increase.

  • @karapelerin61
    @karapelerin61 6 years ago

    Thanks for this video. We find the best weights and biases with minimum error after training. So how can I test a data set that was not used in training? How can I use these best weights after training?

    • @finneggers6612
      @finneggers6612  6 years ago

      I am not sure what you are asking, but I guess you are asking "how to save the weights". Check out video 10 :)

    • @karapelerin61
      @karapelerin61 6 years ago

      @@finneggers6612 I am asking how to use the final weights on another data set that was not used in training.

    • @finneggers6612
      @finneggers6612  6 years ago

      After your program stopped or in the same program?

    • @karapelerin61
      @karapelerin61 6 years ago

      In the same program. I actually want to know how I can test another input/output pair with the final weights.

    • @finneggers6612
      @finneggers6612  6 years ago

      Then: after your training is done, just call calculate(input) on the new input data and check the output data, for example:
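      Something like this (variable names are just placeholders):
      double[] prediction = net.calculate(newInput); // uses whatever weights training ended with
      System.out.println(Arrays.toString(prediction));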

  • @amarine_art
    @amarine_art 6 years ago

    Hi, I tried to train the network with 3 different input arrays (each has 21 elements in it), and the network is outputting the average values of all the outputs. Any idea why? I tried different sizes of hidden layers and reducing the number of hidden layers, but there is no difference.

    • @amarine_art
      @amarine_art 6 years ago

      In addition: With your example values the network works perfectly
      My input values:
      1.
      [0.093590909 0.101281621 0.081175889 0.063991107 0.080633399 0.075176877 0.121766798 0.086863636 0.072614625 0.072859684 0.070606719 0.086630435 0.057466403 0.022576087 0.113220356 0.082048419 0.196905138 -0.013297431 0.165442688 0.100578063 0.057640316]
      Target: [1,0]
      2.
      [0.078590909 0.086281621 0.066175889 0.048991107 0.065633399 0.060176877 0.106766798 0.071863636 0.057614625 0.057859684 0.055606719 0.071630435 0.042466403 0.007576087 0.098220356 0.067048419 0.181905138 -0.028297431 0.150442688 0.085578063 0.042640316]
      Target: [0,1]
      3.
      [0.108590909 0.116281621 0.096175889 0.078991107 0.095633399 0.090176877 0.136766798 0.101863636 0.087614625 0.087859684 0.085606719 0.101630435 0.072466403 0.037576087 0.128220356 0.097048419 0.211905138 0.001702569 0.180442688 0.115578063 0.072640316]
      Target: [0,1]
      And the results:
      [0.3105884485361328, 0.6895619837174333]
      [0.3105884485361328, 0.6895619837174333]
      [0.3105884485361328, 0.6895619837174333]

    • @finneggers6612
      @finneggers6612  6 years ago

      I've seen this problem before; you did everything correctly, but notice that network.calculate(double[] input) returns the output values array itself.
      If you do multiple calculations, it will always return the identical array, so the values in your array will change when you run the calculate method again.
      I forgot about this in the video. The solution is:
      instead of returning the output array itself, create a copy and return that.
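      For example, the end of calculate could return a copy along these lines (a sketch; the field names match the code posted later in this thread):
      return Arrays.copyOf(output[NETWORK_SIZE - 1], OUTPUT_SIZE);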

    • @amarine_art
      @amarine_art 6 years ago

      Well,
      System.out.println(Arrays.toString(net.calculate(input)));
      System.out.println(Arrays.toString(net.calculate(intruderinput1)));
      System.out.println(Arrays.toString(net.calculate(intruderinput2)));
      this did the trick :D

    • @Евгений-ч9к2ф
      @Евгений-ч9к2ф 6 years ago

      @@finneggers6612, help me please with that!

    • @finneggers6612
      @finneggers6612  6 years ago

      "instead of returning the output array itself, create a copy and return that. "@@Евгений-ч9к2ф
      use this:
      if (input.length != this.INPUT_SIZE) return null;
      this.output[0] = input;
      for (int layer = 1; layer < NETWORK_SIZE; layer++) {
      for (int neuron = 0; neuron < NETWORK_LAYER_SIZES[layer]; neuron++) {
      double sum = bias[layer][neuron];
      for (int prevNeuron = 0; prevNeuron < NETWORK_LAYER_SIZES[layer - 1]; prevNeuron++) {
      sum += output[layer - 1][prevNeuron] * weights[layer][neuron][prevNeuron];
      }
      output[layer][neuron] = sigmoid(sum);
      output_derivative[layer][neuron] = output[layer][neuron] * (1 - output[layer][neuron]);
      // output[layer][neuron] = sum > 0? sum: 0.01*sum;
      // output_derivative[layer][neuron] = sum > 0? 1: 0.01;
      }
      }
      double[] out = new double[this.OUTPUT_SIZE];
      for(int i = 0; i < out.length; i++){
      out[i] = output[NETWORK_SIZE-1][i];
      }
      return out;

  • @kpan
    @kpan 6 years ago +1

    Hello, first of all I want to say thank you for the amazing videos you make.
    I have run into a problem: at the end, the results I get are [0.7109659288665475, 0.7110625821985218, 0.7110955482357135, 0.7109424722287235]. As you can see, they are pretty much all equal... any ideas?

    • @finneggers6612
      @finneggers6612  6 years ago +1

      There could be a lot of reasons for that. Did you make sure that you are running the backprop algorithm something like 10,000 times and not just once?

    • @kpan
      @kpan 6 years ago

      Finn Eggers Yeah I did. I think it's optimising to a mean value for some reason... I will take a look at it again today and reply again if I find anything.

    • @finneggers6612
      @finneggers6612  6 years ago

      Π. Καράπας if you don’t find the reason, you can straight up send me your code and I will have a look this evening :)

    • @kpan
      @kpan 6 years ago

      Can't figure it out... Here is my code (I have changed some names to be more in my style, but it should be self-explanatory):
      P.S. Thank you very much, not just for me but for answering and helping everyone that has been asking questions. You got a subscriber!
      import java.util.Arrays;
      public class Network {
          private final int[] NETWORK_LAYER_SIZES;
          private final int INPUT_SIZE, OUTPUT_SIZE, NETWORK_SIZE;
          private double[][] output;
          private double weight[][][]; // dimensions: layer, neuron, connected neuron
          private double[][] bias;
          private double[][] error;
          private double[][] outputDerivative; // output through the derivative of the sigmoid function
          public Network(int... NETWORK_LAYER_SIZES) {
              this.NETWORK_LAYER_SIZES = NETWORK_LAYER_SIZES;
              this.INPUT_SIZE = NETWORK_LAYER_SIZES[0];
              this.NETWORK_SIZE = NETWORK_LAYER_SIZES.length;
              this.OUTPUT_SIZE = NETWORK_LAYER_SIZES[NETWORK_SIZE - 1];
              this.output = new double[NETWORK_SIZE][];
              this.error = new double[NETWORK_SIZE][];
              this.outputDerivative = new double[NETWORK_SIZE][];
              this.weight = new double[NETWORK_SIZE][][];
              this.bias = new double[NETWORK_SIZE][];
              for (int i = 0; i < NETWORK_SIZE; i++) {
                  this.output[i] = new double[NETWORK_LAYER_SIZES[i]];
                  this.error[i] = new double[NETWORK_LAYER_SIZES[i]];
                  this.outputDerivative[i] = new double[NETWORK_LAYER_SIZES[i]];
                  this.bias[i] = NetworkTools.createRandomArray(NETWORK_LAYER_SIZES[i], -0.5, 0.7);
                  if (i > 0) {
                      weight[i] = NetworkTools.createRandomArray(NETWORK_LAYER_SIZES[i], NETWORK_LAYER_SIZES[i - 1], -1, 1);
                  }
              }
          }
          public void train(double[] input, double[] target, double learningRate) {
              if (input.length != INPUT_SIZE || target.length != OUTPUT_SIZE) {
                  return;
              }
              calculate(input);
              culcError(target);
              updateWeights(learningRate);
          }
          public void updateWeights(double learningRate) {
              for (int layer = 1; layer < NETWORK_SIZE - 1; layer++) {
                  for (int neuron = 0; neuron < NETWORK_LAYER_SIZES[layer]; neuron++) {
                      double delta = -learningRate * error[layer][neuron];
                      bias[layer][neuron] += delta;
                      for (int prevNeuron = 0; prevNeuron < NETWORK_LAYER_SIZES[layer - 1]; prevNeuron++) {
                          weight[layer][neuron][prevNeuron] += delta * output[layer - 1][prevNeuron];
                      }
                  }
              }
          }
          public void culcError(double[] target) {
              for (int neuron = 0; neuron < NETWORK_LAYER_SIZES[NETWORK_SIZE - 1]; neuron++) {
                  error[NETWORK_SIZE - 1][neuron] = (output[NETWORK_SIZE - 1][neuron] - target[neuron]) * outputDerivative[NETWORK_SIZE - 1][neuron];
              }
              for (int layer = NETWORK_SIZE - 2; layer > 0; layer--) {
                  for (int neuron = 0; neuron < NETWORK_LAYER_SIZES[layer]; neuron++) {
                      double sum = 0;
                      for (int nextNeuron = 0; nextNeuron < NETWORK_LAYER_SIZES[layer + 1]; nextNeuron++) {
                          sum += weight[layer + 1][nextNeuron][neuron] * error[layer + 1][nextNeuron];
                      }
                      error[layer][neuron] = sum * outputDerivative[layer][neuron];
                  }
              }
          }
          public double[] calculate(double... input) {
              if (input.length != INPUT_SIZE) {
                  return null;
              }
              this.output[0] = input;
              for (int layer = 1; layer < NETWORK_SIZE; layer++) {
                  for (int neuron = 0; neuron < NETWORK_LAYER_SIZES[layer]; neuron++) {
                      double sum = bias[layer][neuron];
                      for (int prevNeuron = 0; prevNeuron < NETWORK_LAYER_SIZES[layer - 1]; prevNeuron++) {
                          sum += output[layer - 1][prevNeuron] * weight[layer][neuron][prevNeuron];
                      }
                      output[layer][neuron] = sigmoid(sum);
                      outputDerivative[layer][neuron] = output[layer][neuron] * (1 - output[layer][neuron]);
                  }
              }
              return output[NETWORK_SIZE - 1];
          }
          private double sigmoid(double x) {
              return 1d / (1 + Math.exp(-x));
          }
          public static void main(String[] args) {
              Network net = new Network(4, 1, 3, 4);
              double[] input = new double[]{0.1, 0.5, 0.2, 0.9};
              double[] output = new double[]{0, 1, 0, 0};
              for (int i = 0; i

    • @finneggers6612
      @finneggers6612  6 years ago +1

      Fixed it:
      public void updateWeights(double learningRate) {
          for (int layer = 1; layer < NETWORK_SIZE; layer++) {
              for (int neuron = 0; neuron < NETWORK_LAYER_SIZES[layer]; neuron++) {
                  double delta = -learningRate * error[layer][neuron];
                  bias[layer][neuron] += delta;
                  for (int prevNeuron = 0; prevNeuron < NETWORK_LAYER_SIZES[layer - 1]; prevNeuron++) {
                      weight[layer][neuron][prevNeuron] += delta * output[layer - 1][prevNeuron];
                  }
              }
          }
      }
      In the first loop, you were only running up to the last layer - 1, but we also want to update the weights of our last layer. We just can't start with the first layer because it has no incoming weights (the input layer is only a buffer for the input data).

  • @Tschipp
    @Tschipp 7 years ago

    I've got something similar working, but my problem is that the network forgets previously learned data sets. What I'm doing is:
    Loop 1000 times:
        adjust weights for set 1
    end loop
    Loop 1000 times:
        adjust weights for set 2
    end loop
    etc...
    This doesn't seem to be doing the trick.
    Should I be doing this instead?
    Loop 1000 times:
        adjust weights for set 1
        adjust weights for set 2
        etc...
    end loop
    Could the issue also be the number of neurons? I'm using 9 input, 30 hidden and 9 output neurons. I'm only using one hidden layer.

    • @finneggers6612
      @finneggers6612  7 years ago

      Well yeah.
      What I usually do when I have multiple datasets is to merge them into one. Then I can apply stochastic gradient descent: each time I train the network, n random input/output pairs are extracted from the merged set (see the sketch below).
      I cannot really tell if your network is too small.
      It totally depends on the size of your data and the size of your datasets :)
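      A sketch of what I mean (the array names are just placeholders for the merged data):
      java.util.Random rnd = new java.util.Random();
      for (int i = 0; i < 100000; i++) {
          int k = rnd.nextInt(inputs.length); // pick a random sample from the merged dataset
          net.train(inputs[k], targets[k], 0.3);
      }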

  • @spassthd8406
    @spassthd8406 4 years ago

    Did you notice that with every video in this series you decreased your audio volume by a tiny bit?

  • @tobiascoombs288
    @tobiascoombs288 6 years ago

    Hello Finn, I have been following the series so far, up until the point between the end of this video and the start of the next one. For some reason, when I try more than one input and target, the network only trains to the last input. So if I do something like this:
    Network skynet = new Network(4, 1, 3, 2);

    double[] input = new double[]{1, .3, .5, .7};
    double[] input2 = new double[]{0, .6, .2, .9};
    // input values
    // output values we want
    double[] target = new double[]{.1, .9};
    double[] target2 = new double[]{.9, .1};
    for (int i = 0; i < 10000; i++) {
        skynet.train(input, target, .3);
        skynet.train(input2, target2, .3);
    }
    double[] output = skynet.calculate(input);
    double[] output2 = skynet.calculate(input2);
    System.out.println("input" + Arrays.toString(input));
    System.out.println("input2" + Arrays.toString(input2));
    System.out.println("Target " + Arrays.toString(target));
    System.out.println("Target2 " + Arrays.toString(target2));
    System.out.println("Output " + Arrays.toString(output));
    System.out.println("Output 2 " + Arrays.toString(output2));
    the network puts out:
    input[1.0, 0.3, 0.5, 0.7]
    input2[0.0, 0.6, 0.2, 0.9]
    Target [0.1, 0.9]
    Target2 [0.9, 0.1]
    Output [0.900000411884334, 0.10000044917638785]
    Output 2 [0.900000411884334, 0.10000044917638785]
    where both outputs are identical and approximately the second target values every time. If you need me to, I can post the code for my entire network class. Thanks for any help you can provide.

    • @finneggers6612
      @finneggers6612  6 years ago

      So... I was really shocked, because I thought: okay, probably just some error in the code. But when I put your code into my main method, I was afraid:
      I got the same result.
      But I found the error.
      I am curious why I did not implement it in a way where this would work, because this is a huge flaw.
      If you look closely at the calculate method, you can see that when we return the output, we simply return the double array of the last layer.
      But Java is OOP. This means that you are always referencing the same array, so calling the calculate method again overwrites the output you stored earlier. For storing an output, you need to copy that array.
      I highly recommend writing a method that copies the output array and returns it instead.
      You probably got everything right; I just messed up a little bit and forgot to show that we need to clone/copy the output array.
      This would work, because we are not storing the array but printing it directly:
      System.out.println("input" + Arrays.toString(input));
      System.out.println("input2" + Arrays.toString(input2));
      System.out.println("Target " + Arrays.toString(target));
      System.out.println("Target2 " + Arrays.toString(target2));
      System.out.println("Output " + Arrays.toString(skynet.calculate(input)));
      System.out.println("Output 2 " + Arrays.toString(skynet.calculate(input2)));
      Hope this helps.
      Greetings,
      Finn

    • @tobiascoombs288
      @tobiascoombs288 6 years ago +1

      Thank you so much!
      I've been getting so frustrated as to why this wasn't working. I changed it to what you suggested and it worked just fine the first time. Now I can continue watching your series. I appreciate the quick reply and if I have any further questions I'll be sure to ask

  • @j-k-l4756
    @j-k-l4756 6 years ago

    I don't know why, but that doesn't work with my code. The rate of "misses" goes up over time. However, this is fixed if my biasDelta is not -eta * errorSignals but just eta * errorSignals,
    but I think this is not intended. Could you help me figure out why?

    • @finneggers6612
      @finneggers6612  6 years ago

      Maybe you messed up the error signal of the output neurons? Check that.

    • @finneggers6612
      @finneggers6612  6 years ago

      Check if you got that term with the "-" sign correct.
      I am not totally sure if it needs to be output - target or target - output.
      This could be a reason why your error increases.
      I think I explained the mathematical idea behind that at the end of the previous video.
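      For reference, the two choices just have to be consistent with each other (a sketch using the variable names that appear in the code elsewhere in this thread):
      // convention used in the code in this thread:
      error[NETWORK_SIZE - 1][neuron] = (output[NETWORK_SIZE - 1][neuron] - target[neuron]) * outputDerivative[NETWORK_SIZE - 1][neuron];
      double delta = -eta * error[layer][neuron]; // note the minus sign
      // the equivalent alternative flips both signs:
      // error[NETWORK_SIZE - 1][neuron] = (target[neuron] - output[NETWORK_SIZE - 1][neuron]) * outputDerivative[NETWORK_SIZE - 1][neuron];
      // double delta = eta * error[layer][neuron];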

    • @j-k-l4756
      @j-k-l4756 6 years ago

      First, thanks for the fast help.
      So I triple-checked every equation and I should have everything right.
      I tried target - output, and that makes the NN push every output towards 1, which is not what I want at all.
      The creepy thing about this:
      double delta = -eta * output[layer - 1][prevNeuron] * errorSignals[layer][neuron];
      weights[layer][neuron][prevNeuron] += delta;
      works fine if
      double delta = eta * errorSignals[layer][neuron];
      bias[layer][neuron] += delta;
      (the deltaBias is initialized after the for-loop)
      and not -eta.
      errorSignals[networkSize - 1][neuron] = (output[networkSize - 1][neuron] - targetValues[neuron]) * outputSlope[networkSize - 1][neuron];
      (I called outputDerivative outputSlope for some mathematical reasons)
      These are the errorSignals for the last layer.
      double sum = 0;
      for (int nextNeuron = 0; nextNeuron < networkLayerSizes[layer + 1]; nextNeuron++) {
          sum += weights[layer + 1][nextNeuron][neuron] * errorSignals[layer + 1][nextNeuron];
      }
      errorSignals[layer][neuron] = sum * outputSlope[layer][neuron];
      And these are for the hidden layers.

    • @finneggers6612
      @finneggers6612  6 years ago

      Okay that's weird. Can you send me your code and I will have a look at it?

    • @j-k-l4756
      @j-k-l4756 6 years ago

      Sorry for not answering for so long.
      Do you just need the code of the NN class, or the whole project?

  • @tutkinrannan2184
    @tutkinrannan2184 6 years ago

    Why not just set eta to 10000 so you get the exact values?

    • @finneggers6612
      @finneggers6612  6 years ago +2

      It doesn't work that way. Each time we run the algorithm, we calculate the gradient of the error function.
      The gradient gives us a hint about which direction we need to move our weights so that the error decreases.
      The thing is that the next time we run the network, the gradient will have changed as well, and we will move the weights in another direction.
      Think of it like a treasure hunt where you get hint after hint to eventually find your target. Now if your eta is too big, your steps will be too big and you will overshoot your target.
      If your eta is too small, you will find your target, but it will take a very long time.
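      A tiny sketch of this effect on the simplest possible error function E(w) = w^2, whose gradient is 2w (values purely illustrative):
      for (double eta : new double[]{0.01, 0.45, 1.5}) {
          double w = 1.0;
          for (int i = 0; i < 50; i++) w += -eta * 2 * w; // one gradient descent step per iteration
          System.out.println("eta = " + eta + " -> w after 50 steps: " + w);
      }
      // eta = 0.01 creeps towards 0 slowly, eta = 0.45 gets there quickly, eta = 1.5 overshoots and blows up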

  • @oorosh
    @oorosh 6 years ago +1

    Dear Finn, thanks for this amazing tutorial. I managed to get this far. When I run the code, I get (almost) 1 where 1 was the target and 0.5 where 0 was the target. I'm not able to find what is wrong. Any help would be very appreciated :) Here is my code:
    import java.util.Arrays;
    public class Network {
        public final int[] NETWORK_LAYER_SIZES;
        public final int INPUT_SIZE;
        public final int OUTPUT_SIZE;
        public final int NETWORK_SIZE;
        private double[][] output;
        private double[][][] weights; // 1st layer, 2nd neuron, 3rd neuron in previous layer
        private double[][] bias;
        private double[][] error_signal;
        private double[][] output_derivative;
        public Network(int... NETWORK_LAYER_SIZES) {
            this.NETWORK_LAYER_SIZES = NETWORK_LAYER_SIZES;
            this.INPUT_SIZE = NETWORK_LAYER_SIZES[0];
            this.NETWORK_SIZE = NETWORK_LAYER_SIZES.length;
            this.OUTPUT_SIZE = NETWORK_LAYER_SIZES[NETWORK_SIZE - 1];
            this.output = new double[NETWORK_SIZE][];
            this.weights = new double[NETWORK_SIZE][][];
            this.bias = new double[NETWORK_SIZE][];
            this.error_signal = new double[NETWORK_SIZE][];
            this.output_derivative = new double[NETWORK_SIZE][];
            for (int i = 0; i < NETWORK_SIZE; i++) {
                this.output[i] = new double[NETWORK_LAYER_SIZES[i]];
                //this.bias[i] = new double[NETWORK_LAYER_SIZES[i]];
                this.bias[i] = NetworkTools.createRandomArray(NETWORK_LAYER_SIZES[i], /*lower bound*/0.3, /*upper bound*/0.7);
                if (i > 0) {
                    // weights[i] = new double[NETWORK_LAYER_SIZES[i]][NETWORK_LAYER_SIZES[i-1]];
                    weights[i] = NetworkTools.createRandomArray(NETWORK_LAYER_SIZES[i], NETWORK_LAYER_SIZES[i - 1], /*lower bound*/-0.3, /*upper bound*/0.5);
                }
                this.error_signal[i] = new double[NETWORK_LAYER_SIZES[i]];
                this.output_derivative[i] = new double[NETWORK_LAYER_SIZES[i]];
            }
        }
        public double[] calculate(double... input) {
            if (input.length != this.INPUT_SIZE) return null; // if the input size is not correct you can't calculate
            this.output[0] = input;
            for (int layer = 1; layer

    • @finneggers6612
      @finneggers6612  6 years ago

      oorosh I will take a look this evening. Could you quickly give me the output of the code? I might be able to find the mistake right away if you give me the output. I am on my phone right now, so I can't run the code. I will check later, but I can try now if you tell me the output.

    • @oorosh
      @oorosh 6 years ago

      Here below are some outputs:
      [0.9991460260230022, 0.49999048867611445, 0.4999837869853982, 0.5000001664803813]
      [0.9992033733498885, 0.49998867311186934, 0.499982851186393, 0.4999907293726699]
      [0.9990546936891318, 0.500007692851878, 0.49998310860825856, 0.4999907715307068]
      Thank you

    • @finneggers6612
      @finneggers6612  6 years ago

      In your feed-forward method:
      take a close look at the derivative :)
      If you fix that one, everything will work.
      d/dx s(x) = s(x) * (1 - s(x)), with s(x) being the sigmoid function.
      Your code has:
      d/dx s(x) = s(x) - (1 - s(x)) (notice the "-" sign in the middle)
      If you are not familiar with this notation, I will put it into other words:
      at the point where you calculate the derivative, you've got a "-" sign where a "*" sign should be.
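      In other words, that line should look like the derivative line in the calculate code posted earlier in these comments:
      output_derivative[layer][neuron] = output[layer][neuron] * (1 - output[layer][neuron]);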

    • @oorosh
      @oorosh 6 years ago +1

      YES! That was it. Thank you Finn

  • @nirayoshikage9168
    @nirayoshikage9168 5 years ago +2

    Do you have the source code?

    • @finneggers6612
      @finneggers6612  5 years ago +1

      AlephNaught you can find it in the next video, I think.

    • @nirayoshikage9168
      @nirayoshikage9168 5 years ago +1

      @@finneggers6612 ok thank you for telling me :)

    • @nirayoshikage9168
      @nirayoshikage9168 5 years ago +1

      @@finneggers6612 It only has TrainSet.java. I want the main class, but if you don't have it, that's fine.

    • @finneggers6612
      @finneggers6612  5 years ago

      @@nirayoshikage9168 I uploaded the full code in one of the later videos. Just go through videos 8, 9 and 10 and you will find the full code.

    • @nirayoshikage9168
      @nirayoshikage9168 5 years ago

      Oh ok, I'm sorry.

  • @Srinivas_Billa
    @Srinivas_Billa 7 years ago

    Hi,
    I have made my network and tried training it, giving the input as 1 and 0 and wanting the output to be 0 and 1. The outputs I'm getting are: [0.9999998749994315, 0.5]. I don't understand why I'm not able to get the other output below 0.5.
    Here is the code:
    pastebin.com/YUFQbhmu

  • @NameLast-wm5je
    @NameLast-wm5je 1 year ago

    Pronunciation, dang it... You said "gõringurschaaan" instead of "derivation". In instances such as this you might be causing someone to waste days because they heard you say something incorrectly. Enunciate, please.

    • @Klabauterking
      @Klabauterking 10 months ago

      Gotta agree on this one, the pronunciation is rough! If you start making tutorials again, you might want to check how natives pronounce words, or simply ask DeepL or something :) Also, why don't you record in chunks instead of the whole thing at once? This way you can explain small pieces and make them sound proper instead of getting tangled up in words and "ähms" all the time.