Logit model explained: regression with binary variables (Excel)
- Published: 7 Aug 2024
- How to perform regression analysis when your dependent variable is categorical or binary? How to predict whether the borrower repays their loan, forecast the next recession, or estimate whether a student passes or fails the exam? The logit model (or logistic regression) comes to the rescue! Today we will learn how to estimate logit model parameters, coefficient standard errors, and how to forecast with it using a real-world example of credit scoring data. Econometrics is easy with NEDL!
You can find the spreadsheets for this video and some additional materials here: drive.google.com/drive/folders/1sP40IW0p0w5IETCgo464uhDFfdyR6rh7
Please consider supporting NEDL on Patreon: www.patreon.com/NEDLeducation
An incredibly useful channel; I have done a lot of research in Excel thanks to Savva!
Thank you infinitely much!
Really helped!
I would like you to keep up such good work!
Thank you so much and well done! I love your content and your Excel approach. Video suggestion: statistical classification methods (supervised and unsupervised)
Hi Corrado, and thanks so much for the feedback! Great suggestions, might be a little outside of my comfort zone, but I will keep a note on them :)
Very instructive for understanding the mechanics of something you use many times in Python without knowing how the results are calculated.
Exactly the motivation behind this video :) Logit/probit are actually much easier to estimate and interpret than many people think, and working them through in Excel is very much possible and quite rewarding!
Could you do a video on conditional logistic regression within Excel?
Congratulations on the clear and instructive video.
How would that work with a dependent variable that, rather than being binary (0, 1), is defined on more than two categories (e.g. 1, 2, 3)?
Hi Fernando, and thanks for the excellent question! To estimate a model like that, you can use ordered logit or ordered probit. The technique is the same in spirit but slightly different in implementation, and I might touch upon that in a later video someday!
@NEDL When you calculated the standard error, do we need to divide by the square root of the sample size?
Hi. Thanks for the concise explanation. Can you do a video on how to estimate the parameters when the dependent variable has more than two categories, or how to perform a multinomial regression in Excel?
Maybe convert the multi-category variable to one-hot encoding?
Thanks for your great video. I have two questions, please. First, how do we know that our data contains a sufficient number of "1" values? That is, what is the desired share of "1" and "0", and how did you come to the conclusion that 25% is OK? Suppose I collect 10,000 observations and only 12% of them are "1": how do I know whether this is sufficient? Is there any independent methodology to resolve such a dilemma? Second, I have the same dilemma when I want to insert a dummy explanatory variable in a simple OLS regression. Is there any a priori methodology to know whether it is worth inserting the dummy variable based on the number of "1" values, or in this case do we leave it to the p-value to determine whether the dummy variable has any impact?
Hello, another great video!!!! I have one question: is it OK to use the logit model on variables that are not categorical or binary? For instance, I want to know what variables impact the market share of a company. Market share data is bounded by 0 and 1 but is not binary. Is it OK to estimate a logit model in this case? If not, what kind of model should I use?
Thanks in advance and keep doing your excellent videos!!!!
Hi Victor, and thanks for the excellent question! For this, a censored regression model such as Tobit would be ideal. I have got a video on that here: ruclips.net/video/QS3OAYML2nM/видео.html
Hello @NEDL, I searched through the Google Drive, but there is no file for this video. If you have it, could I have it to better understand your explanation? Thanks.
Hi Moch, please check the NEDL_Probit file as it includes both binary choice model regressions.
Can you explain a little bit more why the variance of the coefficients equals $(X^\top W X)^{-1}$?
Can you please help with how the log-likelihood reaches its maximum value using Solver? What's the logic behind it?
Hi Akash, and thanks for the question! Keeping things simple, the logic of maximising log-likelihood is to get the best possible fit of the model (the best possible explanation of what happens) to the data by varying the parameters. This is due to the probability density function representing the derivative of the distribution function and being interpretable as likelihood. Log-likelihood is maximised instead of simple likelihood as manipulating a sum is easier computationally than manipulating a product (log converts a product to a sum of logs). Hope it helps!
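To make the reply above concrete, here is a minimal sketch in Python rather than Excel. It is not the spreadsheet from the video: the eight data points are invented for illustration, the function names are hypothetical, and Newton-Raphson stands in for Excel Solver's numerical search. The standard errors are taken as the square roots of the diagonal of $(X^\top W X)^{-1}$, with $W$ holding the weights $p(1-p)$.

```python
import math

def sigmoid(z):
    # Logistic function: maps the linear index b0 + b1*x to a probability.
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    # Log turns the product of per-observation likelihoods into a sum.
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def score_and_information(b0, b1, xs, ys):
    # Gradient of the log-likelihood and the entries of X^T W X,
    # where W is diagonal with weights p * (1 - p).
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return (g0, g1), (h00, h01, h11)

def invert_2x2(h00, h01, h11):
    # Inverse of the symmetric matrix [[h00, h01], [h01, h11]].
    det = h00 * h11 - h01 * h01
    return h11 / det, -h01 / det, h00 / det

def fit_logit(xs, ys, iterations=25):
    # Newton-Raphson steps (an assumption; the video uses Solver instead).
    b0 = b1 = 0.0
    for _ in range(iterations):
        (g0, g1), info = score_and_information(b0, b1, xs, ys)
        c00, c01, c11 = invert_2x2(*info)
        b0 += c00 * g0 + c01 * g1
        b1 += c01 * g0 + c11 * g1
    # Covariance matrix (X^T W X)^{-1} evaluated at the final estimates.
    _, info = score_and_information(b0, b1, xs, ys)
    c00, _, c11 = invert_2x2(*info)
    return (b0, b1), (math.sqrt(c00), math.sqrt(c11))

# Toy data, invented for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
coefs, std_errors = fit_logit(xs, ys)
print("coefficients:", coefs)
print("standard errors:", std_errors)
print("log-likelihood:", log_likelihood(*coefs, xs, ys))
```

At the maximum the gradient of the log-likelihood is zero, which is the condition a numerical optimiser is searching for when it varies the parameters.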
@@NEDLeducation Understood.
It would be very helpful if you could make a video on these approximations of maximum likelihood.
Dear Sava, sorry for bothering you, but I am wondering what the alpha is here for the p-value's two-tailed test: is it set at 0.05, as is conventional? Plus, I am not sure about the role of the constant here... Sorry for my very beginner-level questions.
Hi Marina, and many thanks for the questions! The significance testing for coefficients here works exactly the same way: a p-value below 10%, 5%, or 1% means the relationship between an independent variable and the binary variable is significant at the respective level. The constant here can be interpreted as the log-odds (for logit) or a z-stat (for probit) for the binary variable if all other independent variables are zero. Most of the time, we are not necessarily interested in the statistical significance of the constant. Hope this helps!
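As a small illustration of the significance test described in this reply, the sketch below computes a two-tailed p-value from a coefficient and its standard error using the standard normal distribution. The input numbers and function names are hypothetical and are not taken from the video's credit scoring data.

```python
import math

def two_tailed_p_value(coefficient, standard_error):
    # z-statistic: how many standard errors the estimate is from zero.
    z = coefficient / standard_error
    # For a standard normal variable, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

def significance_levels(p_value, levels=(0.10, 0.05, 0.01)):
    # Conventional levels at which the coefficient counts as significant.
    return [level for level in levels if p_value < level]

# Hypothetical coefficient and standard error, for illustration only.
p = two_tailed_p_value(0.9, 0.4)
print("p-value:", p)
print("significant at levels:", significance_levels(p))
```

No single alpha is built into the calculation; the p-value is simply compared with whichever level the analyst chooses.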
@@NEDLeducation Aaa ok ok! The constant is maybe what we call the intercept b0, whose probability is calculated through the inverse logit $P = e^{b_0} / (1 + e^{b_0})$ when x = 0! I think now everything is clear EVEN to me! THANK YOU!
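The inverse logit in this comment can be checked with a few lines of Python. The intercept value below is hypothetical and chosen only to show the conversion between log-odds, odds, and probability when every regressor is zero.

```python
import math

def inverse_logit(log_odds):
    # P = e^b / (1 + e^b), the formula quoted in the comment above.
    return math.exp(log_odds) / (1.0 + math.exp(log_odds))

b0 = -1.5  # hypothetical intercept, i.e. the log-odds when all x are zero
baseline_probability = inverse_logit(b0)
baseline_odds = math.exp(b0)

print("baseline probability:", baseline_probability)
print("baseline odds:", baseline_odds)
```

Taking the log of p / (1 - p) recovers b0, which is why the intercept reads as the baseline log-odds.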
Hi sir, can you suggest any book on predicting loan defaults using Excel?
Hi Ramesh, in terms of how regulators view prediction of loan defaults in various models I would suggest this source: www.bis.org/ifc/events/ifc_8thconf/ifc_8thconf_4c4pap.pdf
@@NEDLeducation Sir, thanks for your reply; the article is informative. Thank you, sir.
Dear Sir, this Excel file is not available in your files directory.
Hi Salar, just added the Logit spreadsheet. Hope it helps!
Hey, where is this Data Analysis tab? Your face hid it in the video.
Hi, and thanks for the question! It is on the top right of the Data tab. If you have not got this add-in installed, you can go File -> Options -> Add-ins -> Excel Add-ins -> Go and tick the Analysis ToolPak in the menu that appears (similar to how you enable Solver). Hope this helps!
video is blurry