This video is incredibly helpful!! I am a college student taking a physics lab right now in which we have to fit Gaussians to several data sets. I had no clue what to do until I found this video. Thanks!
Hi. This is an excellent video - thank you. One question I've always had is for a real dataset, how do you know which distribution to choose? Once you know what you're fitting to, the rest is mechanical but making that choice (especially if there's skewness and there's various distributions that could get you the same approximate shape depending on the 2 or 3 parameters) seems tough. Is it a case of trial and error?
is there a way to use solver where i have less data points from my expiriment then the ammount of theoretical data. I only have about 20 values from my experiment but i need to compare it with a set of theoretical values (150 values)... so i can't use the RSS function where i subtract mesured and theoretical
I do have a simple question. I have a very similar scenario of yours. However, every time I change the initial values of a, b and c, I get different final values of a, b and c.
+Atheer Almasri +Conglian Pan I believe this is occurring because you still have a RAND() in your function (and therefore the raw data is changing), your initial values are so far off Excel cannot guess (i.e., use 0.5, 0.7, 0.9 to create the raw data then changing the values to 99, 99, 99 before using the Solver) or you do not have enough iterations (go to Solver=>Options=>Iterations; you can also change other parameters in the option box to try to help out)
You will need to look at the derivative (slope) of the curve and determine when the derivative changes signs. If your dataset has enough data points, you can simply use simply use the SIGN function and look at when it flips from 1 to -1 (for a maximum) or -1 to 1 (for a minimum). For the dataset given in the example, you can use Column D and enter the following equation into D6 “=SIGN((B6-B5)/(A6-A5))”. You can then copy the equation in D6 and paste it from D6:D45. Now you can simply look for when the sign of the number changes; the row directly before this sign change will be the maximum/minimum value.
Thank you so much for your answer sir. But I'm so sorry I don't really get the derivative part. But can I use this SIGN formula directly to determine the peak of the curve?
I have uploaded a modified Excel version to the Google Drive in the link which shows this concept. Columns D-H show this concept (I have moved the graph and other columns over to the right by several columns). This is a very basic example, but by looking at the sign of the derivative in Column D, you can see where the sign changes. This will show either a local minimum or a local maximum. In cell H4, I take the entire maximum of Column E, which will then show the entire function's maximum, even if there are multiple local maximums. Cell G4 then uses a VLOOKUP to find the X value which corresponds with the Y value found in H4. Note to properly use this VLOOKUP, you will need to have Column E be the Y value and Column F be the X value (a VLOOKUP will always require the LOOKUP_VALUE to be listed in the first column of the TABLE_ARRAY; there are other ways of using MATCH/INDEX to allow you to lookup through columns that are on the right side of the TABLE_ARRAY, but these functions tend to be more confusing in my opinion).
+Safa Balekoglu typing in 15 isn't necessary. RAND() provides a random number between 0 and 1, but RAND()/15 only provides a random number between 0 and 0.15. I used RAND()/15 to "minimize" the noise I introduced to the gaussian curve so it still looked like a "normal" gaussian curve. I could have simply used RAND() for this purpose.
+duknoecuzzie no, they are just initial "guesses". If you're using real data, you will need to provide a solution that is somewhat reasonable for the solver to work, otherwise Excel does not have a starting position to converge from.
If you have 2 functions that you have defined with the proper RSS value (cell D2 in the example), then you can simply RSS each of these values again and use the SOLVER on this new cell. I would also add a weighting factor to each RSS value so you can have more control when you are optimizing for each function. You should set the weighting factor equal to 1 at first and then you can increase one weight with respect to the other weight in order to get the best fit for both functions together. For example, if we have the RSS_1 in D2, WEIGHT_1 in D4 (initially set to 1), RSS_2 in H2, and WEIGHT_2 in H4 (initially set to 1), then you can use the formula "=SQRT((D4*D2^2+H4*H2^2)/(D4+H4))" in cell I2. Now you can use the SOLVER on cell I2 and select all the variables you want to change while minimizing cell I2.
Your addition of a random biased y offset does not help fit the the y amplitude, x offset, or x spread. Biased, because it is always a positive y offset between 0 and 1/15. Though, nice attempt.
This video is incredibly helpful!! I am a college student taking a physics lab right now in which we have to fit Gaussians to several data sets. I had no clue what to do until I found this video. Thanks!
Well done! Nice job and very useful and helpful! Thank you.
thanks very much! helped tremendously, now I'm ready for this physics lab
Thank you very much, finally i can finish my homework
¡Fantástico! Muchas gracias por publicar este vídeo.
Hi. This is an excellent video - thank you. One question I've always had is for a real dataset, how do you know which distribution to choose? Once you know what you're fitting to, the rest is mechanical but making that choice (especially if there's skewness and there's various distributions that could get you the same approximate shape depending on the 2 or 3 parameters) seems tough. Is it a case of trial and error?
I have same doubt.. please let me know if you find any solution
You can use R's fitdist formula.
Very helpful. If you could also explain how to obtain the errors of the fitting parameter, that would be great
Perfect explanation. Thanks
Thanks for saving me with mine project i had no idea how to do this :)
Thanks for your video... I have a doubt. For practical data what are the values of a,b and c. Aren't they mean and standard deviation...
is there a way to use solver where i have less data points from my expiriment then the ammount of theoretical data. I only have about 20 values from my experiment but i need to compare it with a set of theoretical values (150 values)... so i can't use the RSS function where i subtract mesured and theoretical
Great video. Very well explained. Thank you.
Very good, thank you very much.
Is this the non linear regression method using excel??
Great work :)
thank you very much very interesting and helpful video
Thanks a lot for this video!
I do have a simple question. I have a very similar scenario of yours. However, every time I change the initial values of a, b and c, I get different final values of a, b and c.
+Atheer Almasri yeah I have the same concerns. Do you have any idea why this happens yet?
+Atheer Almasri +Conglian Pan I believe this is occurring because you still have a RAND() in your function (and therefore the raw data is changing), your initial values are so far off Excel cannot guess (i.e., use 0.5, 0.7, 0.9 to create the raw data then changing the values to 99, 99, 99 before using the Solver) or you do not have enough iterations (go to Solver=>Options=>Iterations; you can also change other parameters in the option box to try to help out)
Hi, do you happen to know how to find the peak of a curve like this? so i can find the x axis and y axis value
You will need to look at the derivative (slope) of the curve and determine when the derivative changes signs. If your dataset has enough data points, you can simply use simply use the SIGN function and look at when it flips from 1 to -1 (for a maximum) or -1 to 1 (for a minimum). For the dataset given in the example, you can use Column D and enter the following equation into D6 “=SIGN((B6-B5)/(A6-A5))”. You can then copy the equation in D6 and paste it from D6:D45. Now you can simply look for when the sign of the number changes; the row directly before this sign change will be the maximum/minimum value.
Thank you so much for your answer sir. But I'm so sorry I don't really get the derivative part. But can I use this SIGN formula directly to determine the peak of the curve?
I have uploaded a modified Excel version to the Google Drive in the link which shows this concept. Columns D-H show this concept (I have moved the graph and other columns over to the right by several columns). This is a very basic example, but by looking at the sign of the derivative in Column D, you can see where the sign changes. This will show either a local minimum or a local maximum. In cell H4, I take the entire maximum of Column E, which will then show the entire function's maximum, even if there are multiple local maximums. Cell G4 then uses a VLOOKUP to find the X value which corresponds with the Y value found in H4. Note to properly use this VLOOKUP, you will need to have Column E be the Y value and Column F be the X value (a VLOOKUP will always require the LOOKUP_VALUE to be listed in the first column of the TABLE_ARRAY; there are other ways of using MATCH/INDEX to allow you to lookup through columns that are on the right side of the TABLE_ARRAY, but these functions tend to be more confusing in my opinion).
very clutch, thank you!
why you dived 15 ? you typed there (... rand()/15) where that 15 comes from?
+Safa Balekoglu typing in 15 isn't necessary. RAND() provides a random number between 0 and 1, but RAND()/15 only provides a random number between 0 and 0.15. I used RAND()/15 to "minimize" the noise I introduced to the gaussian curve so it still looked like a "normal" gaussian curve. I could have simply used RAND() for this purpose.
I am appreciate for your assistance, thanks
Are the 0.5, 0.7, 0.9 numbers at the top random?
+duknoecuzzie no, they are just initial "guesses". If you're using real data, you will need to provide a solution that is somewhat reasonable for the solver to work, otherwise Excel does not have a starting position to converge from.
Hi how can i minimize two functions simultaneously in excel?
If you have 2 functions that you have defined with the proper RSS value (cell D2 in the example), then you can simply RSS each of these values again and use the SOLVER on this new cell. I would also add a weighting factor to each RSS value so you can have more control when you are optimizing for each function. You should set the weighting factor equal to 1 at first and then you can increase one weight with respect to the other weight in order to get the best fit for both functions together. For example, if we have the RSS_1 in D2, WEIGHT_1 in D4 (initially set to 1), RSS_2 in H2, and WEIGHT_2 in H4 (initially set to 1), then you can use the formula "=SQRT((D4*D2^2+H4*H2^2)/(D4+H4))" in cell I2. Now you can use the SOLVER on cell I2 and select all the variables you want to change while minimizing cell I2.
Your addition of a random biased y offset does not help fit the
the y amplitude, x offset, or x spread. Biased, because it is always
a positive y offset between 0 and 1/15. Though, nice attempt.
you dont help anyone if you use an ideal dataset and then add error function. use random data to explain because we are using raw data.
SADISTIC not statistics...
God im dyeing...my supervisor told me to fit a "zero truncated distn"...like hell i know what that is
*divide