Dr Matt, it's my honor to be your student at BootcampGIS. Your course, "Design a geospatial workflow to plan for smart urban growth", is phenomenal. 🥰😊
Very simple and clear explanation! I used this for teaching with our own data set. Thank you very much for sharing.
I am now one step closer to understanding this important concept. It's crucial that I nail this for my thesis.
Very simple, and very helpful. Thanks Matt.
Wow, great video. Everything has become much clearer. Thank you.
That is actually a pretty cool explanation. Keep doing that please.
Many Thanks.
We really appreciate this great video. Thank you very very much.
Hello, thank you so much for the video, it helped a lot. There is just one question that still remains unanswered for me. I didn't understand why you didn't use the full semi-variogram function and instead only took the difference between the values of the observations and squared it (you only used the right-hand part of the function). In the scatter plot, only the squared differences were shown. So what exactly do I need the semi-variogram function for? What exactly does the γ(h) value mean, and what do I use it for?
I hope you understand my question! And thank you again for this informative video.
Best regards.
Thanks for the video, Matt. However, I wonder why you did not divide the sum of the squared differences by the number of pairs and then multiply by 0.5, as you explained in the formula. Could you please explain this in another video? It is not clear which lags you used here. Thanks in advance.
Bravo Matt, I will use your example for my students.
Wow so simple, big thanks
You are welcome 😊
Hi Matt! Are you the Matt who gave the training on Udemy about ENVI software?
Hi Matt, thank you very much for this video. I just have some questions: 1) Conceptually speaking, the less our plotted data resembles the semivariogram model, the more likely it is that the phenomenon is random in nature? In other words, spatial autocorrelation != random? 2) random != stochastic? If I'm not mistaken, a random variable is one that can take a value with no dependence on any other observation. Stochastic processes, however, while "unpredictable", can be conditioned by adjacent observations, be it in time or space. Am I correct? 3) Finally, can the semivariogram help us determine the distance at which the phenomenon being measured stops being spatially autocorrelated and starts behaving randomly? Thank you very much, and sorry for so many questions!
This is a great explanation. The only thing I don't understand is why you square the difference instead of just using the difference. Thanks!
jose holguin you've asked a great question. The squared difference essentially removes the possibility for negative values skewing the variance at a specific distance. This is most important when multiple observation points are the same distance from each other. The corresponding value on the semi-variogram for this distance would be an average of the squared differences. Incorporating negative values when computing the average would skew the variance for these observation points.
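A minimal sketch of the point made in the reply above, using made-up values rather than the data from the video: four hypothetical pairs of observations that are all the same distance apart. Averaging the signed differences lets positive and negative values cancel, while averaging the squared differences does not.

```python
# Hypothetical observation values for four pairs that share the same separation distance.
# These numbers are illustrative only; they are not taken from the video.
pairs = [(10.0, 14.0), (14.0, 10.0), (7.0, 9.0), (9.0, 7.0)]

# Signed differences: -4, 4, -2, 2 (they cancel when averaged).
signed = [a - b for a, b in pairs]

# Squared differences: 16, 16, 4, 4 (always non-negative).
squared = [(a - b) ** 2 for a, b in pairs]

mean_signed = sum(signed) / len(signed)
mean_squared = sum(squared) / len(squared)

print(mean_signed)   # the signed average hides the variability
print(mean_squared)  # the squared average keeps it
```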
Ahhh that makes a lot of sense! Thank you Matt!
Nice Video Thanks Matt
Thank you for your video!
However, I wonder if there's something wrong with this plot. You just plotted the "squared difference" on the y-axis and "distance" on the x-axis, which is not really a variogram. Could you please explain the idea behind it?
I think there is something you don't know about statistics.
@@askhinyarasa9066 Please explain!
There is great confusion in the literature on the usage of variogram versus semi-variogram. It mostly surrounds the prefix 'semi', meaning half. The way that I like to explain it is that variograms represent the total variance between two observation points at a given distance. Semi-variograms, on the other hand, shift the focus to a single point, meaning that the total variance has been divided evenly between the two paired points. In other words, if focusing on an individual point, it has half of the variance associated with it for a given distance between two points (i.e., the spatial lag). You will see in the textbook Spatial Statistics & Geostatistics by Chun and Griffith, and on the ESRI website, that they refer to the semi-variogram, as it is the form used in applied geostatistical analysis (desktop.arcgis.com/en/arcmap/latest/extensions/geostatistical-analyst/modeling-a-semivariogram.htm). In both places, you'll notice the 1/2 in the formulas, which represents the halving that is present in a semi-variogram.
Great video. Why didn't you sum the squared differences and divide by N? I presume this is a variogram and not a semi-variogram. The video has helped me a lot to generate a semi-variogram in Excel for my drill hole data. I compared it to the results from the Surpac mining software and it's a perfect match.
Hello Solomon - I'm glad to hear that the output matched that produced by the Surpac mining software. Regarding your question, there is widespread debate on the terminology. I found this article, which provides further context. It can be viewed at link.springer.com/article/10.1007%2Fs11004-011-9348-3?LI=true.
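For reference, the 1/2 mentioned above appears in the usual textbook form of the empirical semi-variogram. The notation here is an assumption and is not taken from the video: N(h) is the number of point pairs separated by the lag h, and z(x) is the observed value at location x.

```latex
\hat{\gamma}(h) = \frac{1}{2\,N(h)} \sum_{i=1}^{N(h)} \bigl[ z(x_i) - z(x_i + h) \bigr]^2
```

Under this convention the variogram is 2γ(h), so for a lag that contains a single pair, the plain squared difference is exactly twice the semi-variance.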
Hello,
Could you please provide me with these data? I need them for a class project.
My email:
badrouchi.foued@gmail.com
@solomon ansah,
I was also a bit confused. I think he neither multiplied nor divided by N because N is usually 1 for a given distance (if you count each pair only once, i.e. xi-xj but not also xj-xi).
But the formula says "1/(2N) * Sum" -> Didn't Matt forget to multiply by 1/2?
Could you please provide me with a sample Excel file as a template?
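To make the roles of N and the 1/2 concrete, here is a rough sketch of how the full formula could be computed. It is not the method shown in the video: the function name `empirical_semivariogram`, the `lag_width` parameter, and the sample points are all hypothetical, and the points are assumed to lie on a flat Cartesian plane.

```python
import math
from itertools import combinations

def empirical_semivariogram(points, lag_width):
    """Group point pairs into distance bins and compute gamma for each bin.

    points: list of (x, y, z) tuples on a Cartesian plane (assumed).
    lag_width: width of each distance bin (hypothetical parameter).
    Returns {bin_index: (mean_distance, gamma, n_pairs)}.
    """
    bins = {}
    # combinations() yields each unordered pair once, so xi-xj and xj-xi are not both counted.
    for (x1, y1, z1), (x2, y2, z2) in combinations(points, 2):
        h = math.hypot(x2 - x1, y2 - y1)   # Euclidean distance between the pair
        k = int(h // lag_width)            # index of the lag bin this pair falls into
        sum_h, sum_sq, n = bins.get(k, (0.0, 0.0, 0))
        bins[k] = (sum_h + h, sum_sq + (z1 - z2) ** 2, n + 1)
    # gamma = (sum of squared differences) / (2 * N), with N = number of pairs in the bin
    return {k: (sum_h / n, sum_sq / (2 * n), n)
            for k, (sum_h, sum_sq, n) in sorted(bins.items())}

# Illustrative points only (x, y, value); not the data from the video.
sample = [(0.0, 0.0, 1.0), (1.0, 0.0, 3.0), (2.0, 0.0, 2.0)]
result = empirical_semivariogram(sample, lag_width=1.0)
for k, (mean_h, gamma, n) in result.items():
    print(k, mean_h, gamma, n)
```

When a bin holds only one pair (N = 1), the sum and the division by N have no visible effect, and the only difference from the raw squared difference is the factor of 1/2.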
Great video! Just wondering. As these points are made of latitude and longitude (coordinates in a non Cartesian coordinate system), does it make sense to use Euclidean distances? Wouldn’t it be needed first to calculate the Cartesian coordinate of the points from their latitude and longitude values? Many thanks!
Hello GranCombo - Thanks for checking out this video. The example is intended to be performed most simply on a flat, planar surface. When one begins calculating distance on Earth's surface, the equations become more complex because, as you know, the curvature of the Earth must be incorporated in the distance formula. On a Cartesian plane, Euclidean distance, or straight-line distance, is appropriate. The points in the video are locations on a Cartesian plane (en.wikipedia.org/wiki/Cartesian_coordinate_system). The example is intended to help individuals better understand what Geostatistical Analyst is doing behind the scenes when it searches for spatial autocorrelation in datasets. You are absolutely correct that further adjustments to the distance formula would be needed when measuring distance on a spherical object.
Is the summation over Z missing?
Is the y-axis the semivariance?
Khadijah Sahdan yes, correct. The x-axis is the distance between paired observation points.
Thank you, Matt. Does that mean the y-axis is the squared differences, simply called semivariance? Or is there a specific formula to generate the semivariance for the y-axis? Plotting the y-axis (semivariance) confuses me because I find many different formulas when I search for semivariance.
Could you please send me the Excel sample?
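As an illustration of the kind of adjustment mentioned above, one common option for latitude/longitude points is the haversine great-circle distance. This is only a sketch under a spherical-Earth assumption; the function name `haversine_km` and the radius constant are choices made here, not something used in the video.

```python
import math

# Mean Earth radius in kilometres; treating the Earth as a sphere is an approximation.
EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    # Haversine formula
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Two points on the equator, half a world apart: half the circumference of the sphere.
print(haversine_km(0.0, 0.0, 0.0, 180.0))
```

Another common approach is to project the coordinates to a planar coordinate system first and then use Euclidean distance, as in the video's example.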