Causal Effects via Propensity Scores | Introduction & Python Code

  • Published: Jan 6, 2025

Comments • 41

  • @ShawhinTalebi
    @ShawhinTalebi  2 years ago +3

    More on Propensity Scores 👇
    📰Read More: towardsdatascience.com/propensity-score-5c29c480130c?sk=45f0ec6803eba962c0d2d0162185741d
    💻Example Code: github.com/ShawhinT/RUclips-Blog/tree/main/causality/propensity_score

    • @ShawhinTalebi
      @ShawhinTalebi  1 year ago +1

      More in this series 👇
      Intro to Causal Effects: ruclips.net/video/BOPOX_mTS0g/видео.html
      Do-operator: ruclips.net/video/dejZzJIZdow/видео.html
      DAGs: ruclips.net/video/ASU5HG5EqTM/видео.html
      Regression techniques: ruclips.net/video/O72uByJlnMw/видео.html
      Intro to Causality: ruclips.net/video/WqASiuM4a-A/видео.html
      Causal Inference: ruclips.net/video/PFBI-ZfV5rs/видео.html
      Causal Discovery: ruclips.net/video/tufdEUSjmNI/видео.html

  • @journey-in-pixels
    @journey-in-pixels 1 year ago

    Thanks for taking the time to put this video together; I appreciate the working example in addition to the theory.

  • @ifycadeau
    @ifycadeau 2 years ago +2

    Great video as always! Looking forward to more. 😃

  • @Tristana_21
    @Tristana_21 2 days ago

    I have some problems using pickle to read the 'df_propensity_score.p' file on your GitHub; it seems my newer version is not compatible with the pickle you used to save the files. Could I get a new .p file or a .csv file? Thanks a lot!

    • @ShawhinTalebi
      @ShawhinTalebi  2 days ago

      Thanks for raising this! Would you mind submitting an issue on the GitHub repo?
      github.com/ShawhinT/RUclips-Blog/issues
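
      The thread above points at a pickle/pandas version mismatch. As a rough workaround sketch (untested against the repo's exact file, and it still needs an environment that can unpickle the original), load the file once and immediately re-save it as a CSV so it can be read with any pandas version:

      import pickle
      import pandas as pd

      path = "df_propensity_score.p"  # file name taken from the comment above

      try:
          df = pd.read_pickle(path)       # handles pandas-pickled DataFrames
      except Exception:
          with open(path, "rb") as f:     # fall back to the standard-library loader
              df = pickle.load(f)

      df.to_csv("df_propensity_score.csv", index=False)  # version-independent copy
      df = pd.read_csv("df_propensity_score.csv")        # reload without pickle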

  • @youtubeuser4878
    @youtubeuser4878 6 months ago

    Thank you for this. Can you please do a similar video or series on uplift modeling? There are lots of videos and literature explaining the concept, but not enough examples of practical application.

    • @ShawhinTalebi
      @ShawhinTalebi  6 months ago

      Great suggestion! I'll add that to the list :)

  • @programmingwithjackchew903
    @programmingwithjackchew903 2 years ago +1

    Hi sir, what if I don't know the variables at all and want to measure the causal effect of a sub-KPI on the main KPI? In that case, I don't know which sub-KPIs should be classified as confounders. Can you give me a suggestion?

    • @ShawhinTalebi
      @ShawhinTalebi  1 year ago

      That sounds interesting! I might need more context before giving a suggestion. Feel free to email me here: shawhint.github.io/connect.html

    • @kssrr
      @kssrr 6 months ago +1

      Causal inference is design-based, not model-based. If you don't know about your data-generating process and don't know what variables could confound your relationship, you cannot estimate causal effects.

  • @christopherosariemeniyangb76
    @christopherosariemeniyangb76 6 months ago

    Thanks for explaining Propensity Score Matching. How do I get the complete Python code (Jupyter notebook) to run the nearest-neighbour matching method? I already have my data.

    • @ShawhinTalebi
      @ShawhinTalebi  6 months ago

      A notebook version of the example code is available here: github.com/ShawhinT/RUclips-Blog/blob/main/causality/propensity_score/propensity_score_example.ipynb

    • @christopherosariemeniyangb76
      @christopherosariemeniyangb76 6 months ago

      @@ShawhinTalebi Thanks a lot. But I am only familiar with the Jupyter notebook IDE.
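
      For readers who already have their own data, here is a minimal, self-contained sketch of propensity score estimation plus 1-nearest-neighbour matching using scikit-learn. This is not the notebook's exact code; the file name and the columns "treatment", "outcome", "X1", and "X2" are placeholders to adapt to your dataset:

      import pandas as pd
      from sklearn.linear_model import LogisticRegression
      from sklearn.neighbors import NearestNeighbors

      df = pd.read_csv("your_data.csv")          # placeholder file name
      X = df[["X1", "X2"]]                       # confounders
      t = df["treatment"].astype(int)            # binary treatment (0/1)

      # 1) Estimate propensity scores P(T=1 | X) with logistic regression
      df["ps"] = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]

      treated = df[df["treatment"] == 1]
      control = df[df["treatment"] == 0]

      # 2) Match each treated unit to its nearest control unit by propensity score
      nn = NearestNeighbors(n_neighbors=1).fit(control[["ps"]])
      _, idx = nn.kneighbors(treated[["ps"]])
      matched_control = control.iloc[idx.ravel()]

      # 3) Average treatment effect on the treated (ATT) from the matched pairs
      att = treated["outcome"].mean() - matched_control["outcome"].mean()
      print(f"ATT estimate: {att:.3f}")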

  • @surajjha6193
    @surajjha6193 1 year ago

    While working with observational data, how do we decide how large a sample size is needed for the results to be statistically significant?

    • @ShawhinTalebi
      @ShawhinTalebi  1 year ago

      Good question. I haven't come across any special considerations for observational study sample sizes. Traditional sample size determination methods should be a good start.
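
      As a concrete illustration of a traditional sample size calculation (a standard two-sample power analysis; the effect size, alpha, and power values below are placeholders to replace with your own targets):

      # How many units per group are needed to detect a given standardized
      # effect size at the chosen significance level and power?
      from statsmodels.stats.power import TTestIndPower

      n_per_group = TTestIndPower().solve_power(
          effect_size=0.2,   # Cohen's d (placeholder: a "small" effect)
          alpha=0.05,        # significance level
          power=0.8,         # desired power
          ratio=1.0,         # equal group sizes
      )
      print(f"Required sample size per group: {n_per_group:.0f}")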

  • @varun8833
    @varun8833 8 months ago

    Hey, is there any dataset that you could recommend, where I can work on this method?

    • @ShawhinTalebi
      @ShawhinTalebi  8 months ago

      Here's the original dataset I used for the example code: archive.ics.uci.edu/dataset/20/census+income
      A modified version is available at the GitHub repo: github.com/ShawhinT/RUclips-Blog/tree/main/propensity_score
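
      For convenience, a sketch of pulling the dataset directly from the UCI repository with the ucimlrepo package (assumed to be installed via pip; the dataset id below is taken from the URL above and should be verified against the package's listing):

      # Fetch the Census Income (Adult) dataset from the UCI ML Repository
      from ucimlrepo import fetch_ucirepo

      census_income = fetch_ucirepo(id=20)    # id from archive.ics.uci.edu/dataset/20
      X = census_income.data.features         # predictors (age, education, ...)
      y = census_income.data.targets          # income label (<=50K / >50K)
      print(X.shape, y.shape)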

  • @emilyqian8905
    @emilyqian8905 9 months ago

    do you need to test if the result is statistically significant?

    • @ShawhinTalebi
      @ShawhinTalebi  9 months ago

      Good question. This depends on your particular use case. However, broadly speaking, if computing the statistical significance is helpful then I'd say do it.
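
      If a significance check is wanted, one simple (if rough) option is a bootstrap confidence interval on the estimated effect. The sketch below resamples a matched dataset and recomputes a difference in mean outcomes; the file and column names are placeholders, and ideally the full matching pipeline would be re-run inside each bootstrap iteration:

      import numpy as np
      import pandas as pd

      def diff_in_means(d):
          """Difference in mean outcomes between treated and control rows."""
          treated = d.loc[d["treatment"] == 1, "outcome"]
          control = d.loc[d["treatment"] == 0, "outcome"]
          return treated.mean() - control.mean()

      matched_df = pd.read_csv("matched_data.csv")   # placeholder: your matched sample

      # Resample rows with replacement and recompute the effect 2000 times
      boot = [diff_in_means(matched_df.sample(frac=1.0, replace=True))
              for _ in range(2000)]
      lo, hi = np.percentile(boot, [2.5, 97.5])
      print(f"95% bootstrap CI for the effect: ({lo:.3f}, {hi:.3f})")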

  • @FallenJakarta
    @FallenJakarta 1 year ago

    Thank you for the great video! I have a question: you mention the original data, but I found that the original data is different from what you use in this video. Why did you apply feature engineering to the data? Please answer my question; thank you in advance.

    • @ShawhinTalebi
      @ShawhinTalebi  1 year ago

      Thanks for the good question. The main reason is that some variables were originally string types, which cannot be used directly with the given library. Additionally, the use of boolean variables makes the example a bit easier to follow IMO.
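
      As a rough illustration of that feature engineering step (the derived column names and category lists below are assumptions for the census data, not necessarily the exact transformations used in the video):

      import pandas as pd

      raw = pd.read_csv("census_income_raw.csv")   # placeholder for the raw UCI data

      # Treatment as a boolean: does the person hold a graduate degree?
      raw["hasGraduateDegree"] = raw["education"].str.strip().isin(["Masters", "Doctorate"])

      # Outcome as a boolean: income above 50k?
      raw["greaterThan50k"] = raw["income"].str.strip().isin([">50K", ">50K."])

      # Keep a small, library-friendly frame: confounder, treatment, outcome
      df = raw[["age", "hasGraduateDegree", "greaterThan50k"]]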

  • @shadiaelgazzar9195
    @shadiaelgazzar9195 1 year ago

    Thanks for your amazing video, but I have a question: can I use this line
    df = pd.read_csv("dataseta-casual90.csv")
    to load the dataset instead of
    df = pickle.load(open('df_propensity_score.p', 'rb'))

    • @ShawhinTalebi
      @ShawhinTalebi  1 year ago

      You may need to do some additional data prep after reading in the .csv file, but that should work!
      The library I use here works with pandas dataframes.

    • @shadiaelgazzar9195
      @shadiaelgazzar9195 1 year ago

      @@ShawhinTalebi Additional data prep like what? I used the pickle and it gives me an error.

    • @ShawhinTalebi
      @ShawhinTalebi  1 year ago

      @@shadiaelgazzar9195 if you are reading in a raw csv with pandas you may need to do data prep like: checking dtypes, handling missing values, looking out for outliers, etc.

    • @ShawhinTalebi
      @ShawhinTalebi  1 year ago

      @@shadiaelgazzar9195 what error are you getting?

    • @shadiaelgazzar9195
      @shadiaelgazzar9195 1 year ago

      @@ShawhinTalebi I used this line to load the dataset:
      df = pd.read_csv("dataseta-casual90.csv")
      When I run the file, it gives me:
      Exception: Propensity score methods are applicable only for binary treatments
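
      That exception usually means the treatment column is not a boolean variable after loading the CSV. A hedged sketch of the kind of data prep mentioned above, assuming a treatment column named "treatment" (adjust the name and mapping to your data):

      import pandas as pd

      df = pd.read_csv("dataseta-casual90.csv")

      print(df.dtypes)        # check what pandas inferred from the CSV
      print(df.isna().sum())  # check for missing values

      # Recode the treatment to a true boolean, e.g. from 0/1 integers
      df["treatment"] = df["treatment"].astype(bool)

      # If the column holds strings, map them explicitly instead:
      # df["treatment"] = df["treatment"].map({"yes": True, "no": False})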

  • @pachakhan3588
    @pachakhan3588 1 year ago

    Is there an email where I can contact you? I am working on PSM!

    • @ShawhinTalebi
      @ShawhinTalebi  1 year ago

      You can email me here: shawhintalebi.com/contact/

  • @mauriciomandujano5281
    @mauriciomandujano5281 9 months ago

    Thank you for such a digestible and concise video!
    Do you mind if I shoot you an email with a couple q's?

    • @ShawhinTalebi
      @ShawhinTalebi  9 months ago +1

      Happy to help! Feel free to email me here: www.shawhintalebi.com/contact

    • @mauriciomandujano5281
      @mauriciomandujano5281 9 months ago

      Thank you! Will do later next week!
      @@ShawhinTalebi

  • @EV4UTube
    @EV4UTube 1 year ago

    This is really wonderful, but I'd like to offer a suggestion. The suggestion has to do with cadence (i.e., the musical quality of speech and where you place emphasis in the sentence).
    When you start out the video, you are speaking in a fairly natural way and you use emphasis appropriately. In other words, you emphasize what needs to be emphasized. However, by the halfway point in the video (especially in the section about programming), you slip into an ossified, mechanical cadence which is repetitive, static, and divorced from what needs to be emphasized. The emphasis is no longer falling on the term or idea which needs emphasizing; you're just emphasizing because you happen to be near the end of the sentence. It begins to feel like you're trying to hypnotize us - like a tour guide who has offered the tour too many times to be awed by what is being shown. The programming part is really important - please don't drone through it. Use cadence and emphasis to signal to us that you're engaged, that this is important, and that we should be paying attention.
    To me, cadence and emphasis are important signalers in a presentation - that is why parents (when reading to children) use cadence to amplify what is happening in the story. If you're placing emphasis on words which don't need them, you are effectively confusing or misdirecting the listener. It is like reading a story to a child, but using a spooky-voice when telling a story about a beach trip and a happy voice when talking about monsters under the bed. If the programming is important, don't use a cadence reminiscent of someone snoring.
    In your case, your cadence during these hypnotic periods is characterized by low, flat, fairly rapid rolling speech terminating in a loud WORD trailed by a deflation. It is almost like you are bored of this part or want to get thru it quickly and that it is not very important. blah-blah-blah-BLAHhh. blah-blah-blah-BLAHhh.
    Please embrace that your cadence is a powerful tool for you to wield wisely. Don't let your cadence confess to us that you're bored. As silly as it sounds, your cadence helps students retain information.

    • @ShawhinTalebi
      @ShawhinTalebi  1 year ago +1

      Thanks for the feedback. Developing my communication skills is a big focus for me, and cadence is a key part of that skill set. I’ll definitely be more critical of that aspect for future videos.

    • @EV4UTube
      @EV4UTube 1 year ago

      @@ShawhinTalebi Thank you. I worried my suggestion would come across as harsh (which was not my intention). You seem to be more mature than most and may have seen something of value in the suggestion. Thank you.

    • @ShawhinTalebi
      @ShawhinTalebi  1 year ago +1

      Thanks. Growth is important to me, so I value (and often prefer) critical feedback.

  • @EV4UTube
    @EV4UTube 7 months ago

    I'll offer some constructive criticism regarding the cadence of your delivery. Obviously, you can take it or leave it. When starting to speak a sentence you speak fairly rapidly with a low volume, and low variability, but near the end of each sentence, you select one of the penultimate words upon which you will vastly increase the volume and slow the pace. Then start the next sentence very low and rapid and again end slow and loud. The challenge is that it is not as though you placed an emphasis on a term that is particularly worthy of emphasis - it just happens to be one of the words at the end of a sentence. It is just a rote pattern played over and over and over. It reminds me of a tour-bus operator who has been giving the same tour for the last decade and is bored to tears. For me, personally, I find this pattern very irritating and a little like Chinese water torture and since the emphasis is placed on terms not requiring emphasis, my attention is diverted to low-information terms.
    This is probably the natural way you speak and not something you can change. But just compare your delivery to the delivery of newscasters, actors, comedians, or any other entity that presents information verbally - I would argue that this terminal-spiking pattern is not optimal for the listener.

    • @ShawhinTalebi
      @ShawhinTalebi  7 months ago

      That's good feedback. Were there specific points from this video where this pattern stood out most?