How to Use Great Expectations for Data Quality Checks with Airflow

  • Published: Dec 1, 2024

Comments • 25

  • @joaovitoralmeidaaraujobelc6993 6 months ago +1

    Very simple video with an excellent explanation that doesn't overcomplicate things. Thanks for sharing it!

  • @roopashastri9908 4 months ago +1

    Great explanation! Any thoughts on how we can save the Great Expectations results in a database?

    • @thedataguygeorge 3 months ago

      I would configure the expectation results storage location to be a bucket and then have a pipeline that takes the expectation results and stores them in a database
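
A minimal sketch of the second half of that idea, the pipeline step that loads a stored result into a database. It assumes the result JSON has already been read from the bucket, and that it carries the `success`, `meta`, and `statistics` fields shown below; the sample values, the `gx_results` table, and the `save_result` helper are hypothetical, and SQLite stands in for whatever database you actually use.

```python
import json
import sqlite3

# Hypothetical validation result, trimmed down. Field names are assumed to
# match the JSON that Great Expectations writes to its validation results store.
SAMPLE_RESULT = {
    "success": False,
    "meta": {
        "expectation_suite_name": "strawberry_suite",
        "run_id": {"run_name": "2024-12-01"},
    },
    "statistics": {"evaluated_expectations": 2, "successful_expectations": 1},
}


def save_result(conn, result):
    """Flatten one validation result into a row of a summary table."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS gx_results (
               suite TEXT, run_name TEXT, success INTEGER,
               evaluated INTEGER, successful INTEGER, raw_json TEXT)"""
    )
    meta = result.get("meta", {})
    stats = result.get("statistics", {})
    conn.execute(
        "INSERT INTO gx_results VALUES (?, ?, ?, ?, ?, ?)",
        (
            meta.get("expectation_suite_name"),
            meta.get("run_id", {}).get("run_name"),
            int(bool(result.get("success"))),
            stats.get("evaluated_expectations"),
            stats.get("successful_expectations"),
            json.dumps(result),  # keep the full payload for later inspection
        ),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a real database connection
save_result(conn, SAMPLE_RESULT)
rows = conn.execute(
    "SELECT suite, success, evaluated, successful FROM gx_results"
).fetchall()
print(rows)
```

Keeping the raw JSON next to the flattened columns means the summary table can be extended later without re-reading the bucket.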

  • @BubbaB2323 1 year ago +1

    Very useful bud. Thank you.

    • @thedataguygeorge 1 year ago +1

      No problem, do it all for you!

    • @BubbaB2323 1 year ago +1

      @thedataguygeorge will reach out on the side to talk shop if that's cool, loving your work.

    • @thedataguygeorge 1 year ago

      Always cool!

  • @VarunDeep-x5j 1 month ago

    Hi, thanks for sharing this video. When I tried running it, I kept getting errors like "raise gx_exceptions.DataContextError(
    great_expectations.exceptions.exceptions.DataContextError: expectation_suite strawberry_suite not found". While digging into the code, I found the condition "self.expectations_store.has_key(key)". Am I missing something concerning the store?

  • @roopashastri9908 4 months ago +1

    If a Great Expectations validation fails, would this raise alerts?

    • @thedataguygeorge 4 months ago

      Yes, as long as you have alerts configured for your Airflow DAG.
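
One way to configure such alerts is a failure callback. The sketch below is a hedged illustration, not the setup used in the video: `notify_on_failure` is a hypothetical function, the `print` is a placeholder for a real notifier, and the fake context only mimics the `task_instance` and `exception` entries that Airflow passes to callbacks.

```python
from types import SimpleNamespace


def notify_on_failure(context):
    """Build an alert message from the context passed to a failure callback."""
    ti = context.get("task_instance")
    message = (
        f"Task {getattr(ti, 'task_id', 'unknown')} in DAG "
        f"{getattr(ti, 'dag_id', 'unknown')} failed: {context.get('exception')}"
    )
    print(message)  # placeholder: send to Slack, email, PagerDuty, etc.
    return message


# In a real DAG file (requires Airflow), the callback would be attached with
# something like:
#   default_args = {"on_failure_callback": notify_on_failure}
# and default_args passed to the DAG that runs the validation task.

# Local dry run with a fake context (task and DAG names are made up).
fake_context = {
    "task_instance": SimpleNamespace(task_id="gx_validate", dag_id="gx_dag"),
    "exception": ValueError("Validation failed"),
}
alert = notify_on_failure(fake_context)
```

Because a failed validation fails the task, the callback fires the same way it would for any other task error.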

  • @roopashastri9908 4 months ago +1

    Also, how can we automate threshold changes as business needs change?

    • @thedataguygeorge 3 months ago

      You'd want to have another helper pipeline that checks for changing business requirements and then either alerts you or makes adjustments
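
The "makes adjustments" half of such a helper could look roughly like this. It is a sketch under assumptions: the suite uses the legacy 0.x JSON layout (`expectation_type` plus `kwargs`), the `price` column and its limits are invented, and `apply_thresholds` and the `overrides` mapping are hypothetical names for wherever your business rules live.

```python
import json

# Hypothetical suite in the legacy (0.x) JSON layout; column and limits are made up.
suite = {
    "expectation_suite_name": "strawberry_suite",
    "expectations": [
        {
            "expectation_type": "expect_column_values_to_be_between",
            "kwargs": {"column": "price", "min_value": 0, "max_value": 100},
        },
    ],
    "meta": {},
}


def apply_thresholds(suite, overrides):
    """Return a copy of the suite with kwargs overridden per (type, column)."""
    updated = json.loads(json.dumps(suite))  # cheap deep copy of plain JSON data
    for exp in updated["expectations"]:
        key = (exp["expectation_type"], exp["kwargs"].get("column"))
        exp["kwargs"].update(overrides.get(key, {}))
    return updated


# Overrides would come from a config table or file owned by the business side.
overrides = {("expect_column_values_to_be_between", "price"): {"max_value": 250}}
new_suite = apply_thresholds(suite, overrides)
print(new_suite["expectations"][0]["kwargs"])
```

Returning a copy instead of mutating in place makes it easy to diff old and new thresholds and alert on the change before writing the suite back.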

  • @maheshbhatm9998 10 months ago +1

    Thank You

    • @thedataguygeorge 10 months ago

      No worries, let me know if there are any other videos you'd like to see!

  • @karangupta_DE 1 year ago +1

    Hi, do you prefer Soda or Great Expectations?

    • @thedataguygeorge 1 year ago +1

      I've only recently started using Soda, so I'm not sure I have enough experience to form a definitive opinion, but I have definitely enjoyed the UX much more so far. SodaCL is a lot more human-readable than Great Expectations "expectations", imo.

  • @roopashastri9908 4 months ago

    Also, can we include more than one expectation in the expectation file?

  • @LucasGomes-q9t 4 months ago

    At 3:57, how do you create the default great_expectations file? I created the JSON, but I got a blank one.

    • @thedataguygeorge 3 months ago

      You then just fill out that JSON with all the expectation info you want!
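
As a rough illustration of what a filled-out file might contain, here is a sketch that writes a suite with two expectations. It assumes the legacy 0.x suite layout and a file-based project where suites live at `expectations/<suite_name>.json`; the `id` and `price` columns and the `write_suite` helper are hypothetical, and a temporary directory stands in for the real project folder.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical suite contents; the "expectations" list can hold as many
# entries as you need.
suite = {
    "expectation_suite_name": "strawberry_suite",
    "expectations": [
        {
            "expectation_type": "expect_column_values_to_not_be_null",
            "kwargs": {"column": "id"},
        },
        {
            "expectation_type": "expect_column_values_to_be_between",
            "kwargs": {"column": "price", "min_value": 0},
        },
    ],
    "meta": {},
}


def write_suite(expectations_dir, suite):
    """Write the suite to <expectations_dir>/<suite name>.json and return the path."""
    path = Path(expectations_dir) / f"{suite['expectation_suite_name']}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(suite, indent=2))
    return path


with tempfile.TemporaryDirectory() as tmp:
    suite_path = write_suite(Path(tmp) / "expectations", suite)
    loaded = json.loads(suite_path.read_text())
    print(suite_path.name, len(loaded["expectations"]))
```

The file name is derived from the suite name on purpose: with a file-based store, the suite is typically looked up by that name, so a mismatch can surface as a "suite not found" error.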

  • @criistiina71 4 months ago

    May I ask whether we can create our own expectations? If I need an expectation that is not among the ones in the documentation, could I create my own? For example, if one column is computed from a formula and uses a different database, could I create an expectation that checks the math?
    Hi from Colombia :)

    • @thedataguygeorge 3 months ago +1

      You can definitely create your own expectations; honestly, it's one of the best features of Great Expectations!

    • @criistiina71 3 months ago

      @thedataguygeorge Do you have a video tutorial where you show how to connect GX with Databricks? 😊