What you just fabulously explained in 15 lines, takes 4+ blackboards to many Indian teachers to explain even less than that. Thank you so much for sharing your knowledge here.
Glad it was helpful!
You're the professor I wish I'd had in college! Thank you!!
Happy to help!
Amazing. Make a series of Probabilistic ML Models!
You explain the concept not only in a very concise way but also while saving paper. I appreciate both the topic and the saved paper.
Thanks. I hadn't realised how environmentally responsible I was being. 😀 I think it really helps to fit everything onto a single sheet of paper so that the whole explanation is visible all at once, so the viewer can easily refer back at any point.
@iain_explains This is very logical :D
The best explanation on ML and MAP! I finally understood them. Thank you!
I'm so glad the video helped, and that you liked the explanation.
I see now that the MAP estimator is like a weighted version of the ML estimator, where the weights come from prior knowledge of the measurement target. The different conditional distributions fy(y|xi) are “pushed up” or “pushed down” based on the value of the corresponding fx(xi). Of course, provided that all fx(xi) are equiprobable, the MAP estimator reduces to the ML estimator which we commonly see in optimal communications system analysis.
I have a question for you, why is it that the equiprobable symbol scheme is considered most optimal? I am inclined to assume that it is because it yields the highest entropy. Also, I would like to know how it is that we ensure equiprobable signaling?
Thank you
Excellent question. Yes, when a random source is compressed to its minimal representation (using an entropy achieving codebook) it results in a binary sequence that has equally likely ones and zeros. This video provides more insights: "What is Entropy? and its relation to Compression" ruclips.net/video/FlaJPxP8sd8/видео.html
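To make the "weighted ML" picture in this thread concrete, here is a minimal sketch. It assumes the model y = a*x + n with zero-mean Gaussian noise of standard deviation sigma, and a small set of candidate values for x. The function names, the candidate values and the prior probabilities are illustrative assumptions, not taken from the video.

```python
import math

def gaussian_pdf(y, mean, sigma):
    """Gaussian density with the given mean and standard deviation, evaluated at y."""
    return math.exp(-((y - mean) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def ml_estimate(y, candidates, a, sigma):
    """Pick the candidate x that maximises the likelihood f_Y(y|x)."""
    return max(candidates, key=lambda x: gaussian_pdf(y, a * x, sigma))

def map_estimate(y, candidates, priors, a, sigma):
    """Pick the candidate x that maximises f_Y(y|x) * f_X(x)."""
    return max(candidates, key=lambda x: gaussian_pdf(y, a * x, sigma) * priors[x])

candidates = [-1.0, 1.0]   # hypothetical binary symbols
y_obs = 0.2                # hypothetical measurement

uniform_prior = {-1.0: 0.5, 1.0: 0.5}
skewed_prior = {-1.0: 0.9, 1.0: 0.1}

print("ML:              ", ml_estimate(y_obs, candidates, a=1.0, sigma=1.0))
print("MAP (uniform):   ", map_estimate(y_obs, candidates, uniform_prior, a=1.0, sigma=1.0))
print("MAP (skewed 0.9):", map_estimate(y_obs, candidates, skewed_prior, a=1.0, sigma=1.0))
```

With the uniform prior the MAP decision matches the ML decision. With the skewed prior the curve for x = -1 is "pushed up" enough that the MAP decision flips, even though the measurement is closer to +1.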
Thanks a lot! One of the simplest explanations on RUclips
Glad it helped!
Dear prof, you're the best
Thanks. Glad you like the videos.
holy what a clear explanation. it ended my 2 day struggle of not getting it in 18 minutes!!!! thank you
Glad it helped!
I found your videos at the right moment; they cover a lot of the basics of my 1st semester master's courses. Thank you. A nice topic you could cover that comes up a lot in detection and estimation is the Cramér-Rao Lower Bound.
Thanks for your comment. Glad the videos are helpful. And thanks for the C-R suggestion, I'll add it to my "to do" list.
This is the best explanation in the world, thank you !
I'm so glad to hear that you liked the video.
Maybe I am missing some prerequisite knowledge, because I am a little confused. Have we inverted the graph of the function here? f(x) is plotted horizontally and x is plotted vertically. But how can there be a different distribution function for ax1, ax2 ... ax(n) if there is just one input and output?
Um no. x is not plotted vertically. f_Y(y|x) is plotted vertically. This is the density of the random variable Y, given a specific value of the random variable X. This is a different function for each different realisation (value) of X (ie. x_1, x_2, ...).
Are the different x values you are checking for maximum likelihood each a possible input signal, or are we searching on a bit by bit basis?
They are possible realisations of the random variable X. If X represents binary data, then it would be "searching on a bit by bit basis", but if X represented higher order modulation then it would be on a "symbol" basis.
I finally get the difference between the two! Thank you!
I'm glad it helped.
very precisely explained.
Glad you liked it
This is a fantastic video that answered so many questions I had while working through my academic coursework. Thank you so much for uploading!
You're very welcome!
Do you ever have to do a rehearsal beforehand ? I see the explanation is quite smooth.
Thanks, I put quite a bit of thought into how to explain things in the best way.
Do you follow a systematic procedure to construct the explanation process? If so, I really hope that you could share this procedure :). Although everything is short, I find that the information is delivered clearly, with many subtle points and details carefully summarized. Thank you for your inspiring lecture.
Beautiful explanation. Very helpful.
Glad it was helpful!
Very intuitive explanation! 🙏
I'm glad you liked it.
woow so good to understand the MAP and ML
Glad it was helpful.
Does demodulating using ML require channel state information? (i.e. an estimation of the AWGN noise variance)
Yes. Good point. It's almost never mentioned. It's not too hard to get an estimate of the receiver noise - by taking measurements when nothing is being sent (of course you need to be able to work out when nothing is being sent!) It's harder to estimate other parameters, such as channel gain. And there's lots of things that are done to make that possible. See eg: "Channel Estimation for Mobile Communications" ruclips.net/video/ZsLh01nlRzY/видео.html
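The "measure when nothing is being sent" idea in the reply above can be sketched as follows. This assumes the receiver noise is zero-mean, so its variance equals its average power. The simulated samples, the seed and the function name are assumptions for illustration only.

```python
import random

def estimate_noise_variance(silent_samples):
    """Average power of the samples, which equals the variance for zero-mean noise."""
    return sum(s * s for s in silent_samples) / len(silent_samples)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
true_sigma = 0.5         # hypothetical noise standard deviation

# Simulated receiver output during a period when no signal is transmitted.
silent_samples = [rng.gauss(0.0, true_sigma) for _ in range(10_000)]

sigma2_hat = estimate_noise_variance(silent_samples)
print(f"true variance = {true_sigma ** 2:.4f}, estimate = {sigma2_hat:.4f}")
```

The estimate can then be plugged in wherever the likelihood f_Y(y|x) needs the noise variance. Detecting when the channel is actually silent is a separate problem that this sketch does not address.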
Can this be applied in marketing?
I believe the title of the video is genuinely true.
Thanks. It was a comment someone else had made about the video, so it's good to know that you also agree.
Very good explanation with right amount of details and relevant examples. Thanks a million.
Glad you liked it
Outstanding video! You sir have saved the day, again!
I'm so glad the video helped. It's great to read these comments, and know that my videos are making a difference for people. Thanks.
Thanks for your super simple explanation. I now understand how to apply it.
Glad it was helpful!
best explanation of the ML and MAP on youtube
thank you
Thanks. Glad it helped!
great explanation!
Glad it was helpful!
Thanks so much for this video, it explained this much better than my lecturer!!!
I'm so glad it helped!
Great explanation...
Glad you liked it.
Definitely "Best explanation on RUclips" !! ❤ Thanks a lot Sir.
Thanks. I'm glad you think so. And I'm glad it was helpful.
Very helpful video. I have a question there. The MAP is explained as MLE weighed by the probability of the parameter x, and the parameter follows a certain distribution. If X is a continuous random variable, what is the mathematical meaning for f_y(y|x)f_x(x)?
It equals the joint pdf f_{X,Y}(x,y)
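For readers who want the step behind that answer written out, this is the standard Bayes' rule argument, using the thread's notation. The posterior density f_X(x|y) is the only symbol not already used above.

```latex
\[
\hat{x}_{\mathrm{MAP}}
  = \arg\max_x f_X(x \mid y)
  = \arg\max_x \frac{f_Y(y \mid x)\, f_X(x)}{f_Y(y)}
  = \arg\max_x f_Y(y \mid x)\, f_X(x)
  = \arg\max_x f_{X,Y}(x, y)
\]
```

The denominator f_Y(y) does not depend on x, so it can be dropped without changing which x achieves the maximum.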
Thanks for the Video, is there any reference ( book, ...) for that, particulary for numerical solution?
I'm not sure if this is what you're looking for exactly (eg. I'm not sure it has the numerical examples you might be looking for), but I like this book: H. Vincent Poor, “An Introduction to Signal Detection and Estimation”
Beautiful explanation sir, thank you!
Most welcome!
I'm currently learning about autoencoders and it's based on this topic! very helpful and intuitive. Thank you!
Glad it was helpful!
Sir, how do we estimate the channel in the case of a correlated Rayleigh fading channel? For example, y1 = hx_1 + hx_2 + n_1, y2 = hx_1 + hx_2 + n_2.
n_1 and n_2 are white Gaussian noise with different variances.
Thank you so much! That's clear. One question: for MAP, what's f_X(x)?
It's the probability density function for the variable X.
Is AWGN the same as normally distributed error or bias?
This video should help: "What is White Gaussian Noise (WGN)?" ruclips.net/video/QfUQMzHfbxs/видео.html
Hi Iain. Loved your explanation. I wanted to ask a question about MLE. In the plots of x1,x2,xn, When each x1/x2 give a single value for the function, Why does plot exist for x1 when the function takes a single value for x1. Thank You
Sorry, I'm not sure what you're asking.
I think I got what you are saying, and there seems to be a gap in your understanding. Let me try to fill that, although Iain mentioned it in this video.
What you are saying is that for a given value of x, there is only one single value of y through its distribution f(y|x), but that is not true. Actually, there are SEVERAL different distributions of y, one for each of the SEVERAL values of x. So, when Iain says that for a given x the distribution's center shifts, it is actually a new distribution centered around that given x value. Then comes the concept of a single value from these distributions. That is y(bar): the observation, at which the f(y|x) pdf value is read off from each of the distributions of y for those SEVERAL x's. That is the single value that you are thinking of.
Hope I was able to answer your question to some degree. :)
The single value is a result of having measured/fixed y (the y bar hat equation). The plot is for all y (ie it’s a function of y not of x1). x1 is just a guess of the true parameter of the Gaussian distribution (proportional to mean). The horizontal axis (independent variable) is y. Also the function, which is a Gaussian, takes more than just a*x1, it also takes in the variance from the noise. To prove to yourself that the function takes y, look at the form of the Gaussian equation, see the y in there?
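Written out, the Gaussian form referred to in this thread looks like the following. This assumes the measurement model y = a x + n with zero-mean Gaussian noise of variance \sigma^2, which is what the comments above imply. The symbol \sigma^2 and the exact normalisation are assumptions about the notation on the page.

```latex
\[
f_Y(y \mid x_1) = \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\!\left( -\frac{(y - a x_1)^2}{2\sigma^2} \right)
\]
\[
\hat{x}_{\mathrm{ML}} = \arg\max_{x_i} \; f_Y(\bar{y} \mid x_i)
\]
```

The first line is a curve over all y for one fixed guess x_1. The second line fixes y at the measured value \bar{y} and compares the height of each curve at that single point.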
Liked and subbed, very clear and accessible explanation of a concept that made no sense to me as it was presented in my class
I'm so glad it helped!
So it is L(x|y)? We want to maximize the likelihood of x given the data values y? So we are, in a sense, trying to say that there is a high likelihood that the observed data could have come from, or be predicted by, this model of x? Where the probability is P(y|x). Maybe you are saying that and I am not picking up on this. I think you might be, but I might not be understanding your notation.
Yes, that's right.
Wow! Amazing way of explaining these complex ideas.
Glad it was helpful!
Great explanation. Thanks for uploading the video.
My pleasure. I'm glad you liked it.
at a particular point in the density function, the probability is zero right? I'm a little confused.
I'm not exactly sure what you're asking. The density function is a "density" (as the name indicates). This means you need to integrate it over some range of values, in order to find the probability. The probability of any _exact_ value is zero (since the base has zero width, for a single _exact_ value). See: "What is a Probability Density Function (pdf)?" ruclips.net/video/jUFbY5u-DMs/видео.html
@iain_explains Oh sorry, I'm wrong. Thank you so much sir.
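The point about integrating the density can be checked numerically. This is a small sketch for a standard Gaussian, using the closed-form Gaussian CDF (via the error function) in place of numerical integration. The function names and interval widths are illustrative.

```python
import math

def gaussian_cdf(y, mean=0.0, sigma=1.0):
    """P(Y <= y) for a Gaussian random variable."""
    return 0.5 * (1.0 + math.erf((y - mean) / (sigma * math.sqrt(2.0))))

def prob_in_interval(lo, hi, mean=0.0, sigma=1.0):
    """Probability that Y lands in [lo, hi]: the area under the pdf over that range."""
    return gaussian_cdf(hi, mean, sigma) - gaussian_cdf(lo, mean, sigma)

# Shrink an interval centred on y = 0 and watch the probability shrink with it.
for width in (2.0, 0.1, 0.001, 0.0):
    p = prob_in_interval(-width / 2, width / 2)
    print(f"width = {width:<6} probability = {p:.6f}")
```

The density at y = 0 stays at about 0.399 throughout, but the probability goes to zero as the width of the base goes to zero.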
excellent video!!
Thank you so much Sir Iain. You made my day. Great explanation regarding MAP and ML. Hats off Iain
Glad you liked it!
best in the game 🙌🙌
Glad you liked it.
MAP starts at 10:35
Sir which text book should we follow for detection and estimation theory?
I like the book: H. Vincent Poor, "An Introduction to Signal Detection and Estimation", Springer.
Thank you for the great video
Glad you liked it
Incredibly helpful. Thank you!
You're very welcome!
Hi, very helpful video. Could you help me understand how the concept of consistency of the MLE fits in with the graph illustrations?
Particularly, I've learned that as n gets large, mean turns to zero as MLE becomes an even more consistent estimate.
Really big thanks for your video!!
Could you make another video explaining different path loss models, such as Okumura-Hata or the various COST models for wireless channels?
Good suggestions thanks. I've added them to my "to do" list.
Great video. I have a question: if the variable is itself Nakagami distributed, then how can we compute the MLE and MAP?
The term f_X(x) is the density function for the random variable of interest. So, if it is Nakagami distributed, then f_X(x) equals the formula for the Nakagami p.d.f. which you can find in this video: ruclips.net/video/ztpNbE-Vpaw/видео.html
@iain_explains Thank you very much
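As a rough illustration of the Nakagami case discussed in this thread, here is a sketch that finds the ML and MAP estimates by a simple grid search. It assumes the model y = a*x + n with Gaussian noise and a Nakagami prior on x with shape m and spread omega. The grid search, the parameter values and all function names are assumptions made for illustration. A closed form or a proper optimiser would normally be used instead.

```python
import math

def nakagami_log_pdf(x, m, omega):
    """Log of the Nakagami density for x > 0, shape m >= 0.5, spread omega > 0."""
    return (math.log(2.0) + m * math.log(m) - math.lgamma(m) - m * math.log(omega)
            + (2.0 * m - 1.0) * math.log(x) - m * x * x / omega)

def gaussian_log_likelihood(y, x, a, sigma):
    """Log of f_Y(y|x) for y = a*x + n, with n zero-mean Gaussian."""
    return -((y - a * x) ** 2) / (2.0 * sigma ** 2) - math.log(sigma * math.sqrt(2.0 * math.pi))

def grid(x_max, step):
    """Candidate x values step, 2*step, ..., x_max (x = 0 is excluded)."""
    return [step * k for k in range(1, int(round(x_max / step)) + 1)]

def ml_estimate_grid(y, a, sigma, x_max=5.0, step=0.001):
    """Maximise the likelihood alone over the grid."""
    return max(grid(x_max, step), key=lambda x: gaussian_log_likelihood(y, x, a, sigma))

def map_estimate_nakagami(y, a, sigma, m, omega, x_max=5.0, step=0.001):
    """Maximise likelihood times Nakagami prior (a sum of logs) over the grid."""
    return max(grid(x_max, step),
               key=lambda x: gaussian_log_likelihood(y, x, a, sigma) + nakagami_log_pdf(x, m, omega))

y_obs = 1.0  # hypothetical measurement
print("ML :", ml_estimate_grid(y_obs, a=1.0, sigma=1.0))
print("MAP:", map_estimate_nakagami(y_obs, a=1.0, sigma=1.0, m=1.0, omega=1.0))
```

Working in the log domain turns the product f_Y(y|x) f_X(x) into a sum, which avoids underflow and does not change where the maximum is.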
What a clear explanation!
Thank you so much.
Glad it was helpful!
Decent video! Thanks.
Glad you liked it!
Very helpful
The explanation is great. The only problem is using pen and paper instead of something more comfortable. The page is too small for this amount of writing.
On the other hand, having everything on the one page means you don't need to scroll back and forth through the video to see the links to earlier parts, and I can simply point to the earlier parts while explaining how they link to the later parts (as I'm doing in the thumbnail image). Perhaps it doesn't work so well on small screens ...
Amazing, this was so clear to understand. Thank you very much!!!
Glad it was helpful!
ty
It's very helpful thanks sooooo much
You're welcome! I'm glad it helped.
Can you please make a video on softmax regression?
Thanks for the suggestion, but I'm not familiar with it, sorry. I'll have to give it some thought.
Thank you very much.
You are welcome!
This was really informative! Thanks.
Glad it was helpful!
Amazing video, thanks!
Glad you liked it!
I can't understand why the bell curve is shifting for every value of x.
Thank you so much!!! very helpful video
Glad it was helpful!
Thank you. you explained it clearly, just what I was looking for.
Glad it was helpful!
Can we have a video sometime on MMSE and IRC receivers?
Regards,
Amit
Thanks for the suggestion. I've added them to my "to do" list. I'll see what I can do (it's starting to become a long list).
Good explanation of a lot of concepts in wireless communication. I'm watching your video for the preparation of QE. Hope I can pass!
Glad it was helpful! Good luck!
Nice
Thank you!!!
Thank you
Excellent explanation! Thank you very much :)
Glad it was helpful!
What if x is a vector?
Yes, it all follows through for vectors.
Amazing
Glad you liked it.
When you're writing, the screen becomes blurry because the paper is moving. Please stabilize the paper.
I think this video could be improved by providing a concrete example. Also, it's really mathy without much intuitive explanation.
didn't get the idea
Iain you have offered me shelter in a howling wind, thank you - I can leave the library and go home now xo love from rory
That's great. I'm so glad you found the video helpful. Hope you manage to stay out of the wind.
Great, thank you
Glad you liked it!